CHAPTER I


INTRODUCTION 


“THE county superintendent shall provide for the examination of 
all applicants for graduation in the common school branches from town- 
ship, district, or town schools during the months of March, April, and 
May, and furnish them certificates of graduation, if in the judgment 
of the county superintendent they are entitled thereto, which shall en- 
title the recipients to enter any township, town, or city school of the 
state if they be otherwise entitled to the privileges thereto.”¹

To meet the requirements of this law the County Superintendents’ 
Association, in coöperation with the Indiana State Department of Public
Instruction, developed a plan of using, all over the state, the same ques- 
tions for these examinations. To carry out this plan the president of 
the County Superintendents’ Association would appoint a committee of 
county superintendents to make out questions for the annual examina- 
tion in the various subjects. This committee would then divide the 
work so that approximately one subject was given to one county super- 
intendent, over which he was to make the questions that were to be 
used thruout the state. A copy of these questions was then sent to the 
State Department of Public Instruction. Thus, thru the kindness and 
coöperation of the State Department of Public Instruction, these questions
were then printed and sent out to all counties of the state. These
questions were used by the various counties for the final examinations. 

This practice seemed to be a step in advance of the method whereby 
each county was a unit unto itself, and for a period of years it served 
a real purpose thruout the state. 

As methods of teaching and testing procedure advanced, the county 
superintendents became dissatisfied with this method of measuring the 
achievement of the eighth grade pupils. Wherever consolidations had 
taken place and trained people took charge of the consolidated unit, this 
method of testing was discontinued. The county superintendents, how- 
ever, felt that they still desired some check or measurement of the 
achievement in the rural schools and, altho they were dissatisfied with 
the older method, they continued to use it in lieu of a better type. 

To determine whether a given set of questions taken from an eighth 
grade test would give the same or nearly the same results if graded by 
several people, a list of such questions was given to a group of eighth 
grade pupils who were in the last semester of their eighth year. These
papers were graded by 33 graduate students in the Summer Session at
Indiana University, who were equal to county superintendents in ability
and training and almost equal to them in experience. The graders in
this case were given only two instructions: (a) to mark the papers
as they desired, (b) to count 70 as a passing mark. The marks for
each pupil were then listed, and the average deviation of the marks
for each pupil was found. The results from this study are given in Table I.

¹ Acts of Indiana, 1903, p. 291, Sec. 6387.


TABLE I. AVERAGE MARK AND AVERAGE DEVIATION IN MARKS OF PUPILS
ON HISTORY EXAMINATION, 1926

Pupil   | Average Mark | Average Deviation from the Average Mark | Number of Graders
1       | 79           | 8                                       | 33
2       | 82           | 11                                      | 33
3       | 54           | 14                                      | 33
4       | 96           | 6                                       | 33
5       | 90           | 8                                       | 33
6       | 88           | 11                                      | 33
7       | 93           | 8                                       | 33
8       | 87           | 12                                      | 33
9       | 89           | 7                                       | 33
10      | 86           | 9                                       | 33
11      | 89           | 9                                       | 33
12      | 91           | 7                                       | 33
13      | 93           | 8                                       | 33
14      | 77           | 13                                      | 33
15      | 69           | 14                                      | 33
16      | 78           | 12                                      | 33
17      | 84           | 10                                      | 33
18      | 80           | 11                                      | 33
19      | 91           | 8                                       | 33
20      | 83           | 9                                       | 33
21      | 64           | 12                                      | 33
Average | [illegible]  | 9.9                                     | 33


It seemed apparent from these results that the same paper would 
be graded very differently by different county superintendents if these 
students had been county superintendents. In view of this very great 
difference in marks, it seems fair to assume that a pupil’s paper would 
not receive the same mark if it were graded by different county super- 
intendents. 
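The measure of disagreement used in Table I, the average deviation of the graders' marks from their own average, can be sketched as follows. This is an illustration only: the function name is an assumption, and the five marks shown are hypothetical, not data from the study.

```python
from statistics import mean


def average_deviation(marks):
    """Mean absolute deviation of the marks from their own average."""
    center = mean(marks)
    return mean(abs(mark - center) for mark in marks)


# Hypothetical marks given to one pupil's paper by five graders
# (invented for illustration; the study used 33 graders per paper).
marks = [70, 80, 90, 85, 75]

print(mean(marks))               # average mark for the paper
print(average_deviation(marks))  # average deviation from that average
```

A large average deviation for a paper means that the mark it receives depends heavily on which grader happens to read it.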

A study was then made of the questions of a given year to deter- 
mine the comprehensiveness of the examinations given. In the ap- 
pendix are given examples of these questions. A study for one year 
revealed that, in the eight subjects, not including music and art, that 
would be taken by all pupils, a total of 177 items were included. The 
examination items were always listed under eight questions for each 
subject. However, under some of the numbered questions several items 
were included. As an example, in the reading examination was this 
question, “Name five selections you have studied this year and the 
author of each.” The number of items in this question was counted 
as ten. 

The pupils taking the examination were instructed to write answers 
to any six of the eight questions given on each subject. This would 
mean that each pupil, on the average, would cover approximately three-fourths
of the 177 items, or 132 items in the eight subjects. The total
number of items in each subject, and the approximate number to be
covered by the pupil who answered all of the six required, are given in
Table II.


TABLE II. NUMBER OF ITEMS IN EACH SET OF QUESTIONS WITH APPROXIMATE
NUMBER REQUIRED TO ANSWER

Subjects              | Number of Items Included | Approximate Number of Items Covered if Six Questions are Answered
[illegible]           | 25                       | 18
[illegible]           | 25                       | 18
[illegible]           | 17                       | 12
United States History | 13                       | 9
[illegible]           | 11                       | 9
[illegible]           | 17                       | 12
[illegible]           | 29                       | 21
[illegible]           | 40                       | 30








It is to be remembered that, by a pupil’s selection of the six ques- 
tions that he would answer, the number of items covered might vary 
from the approximated number given in Table II. However, the results 
indicated that the pupil would answer such a small number of ques- 
tions as to fail to cover comprehensively the fields of the various marks. 
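The over-all figure of 132 items can be checked with a short calculation. The sketch below assumes, for simplicity, that the items are spread evenly over the eight questions, which the text notes is not strictly true (one reading question alone was counted as ten items); the function and parameter names are assumptions made for illustration.

```python
def approx_items_covered(total_items, questions_answered=6, questions_given=8):
    """Whole number of items covered if items were spread evenly over the questions."""
    return total_items * questions_answered // questions_given


# All eight subjects together: three-fourths of 177 items.
print(approx_items_covered(177))  # 132
```

Because a pupil chooses which six questions to answer, the actual number of items covered in any one subject may fall above or below such an estimate.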

An analysis of the items included in the examinations showed, in 
some cases, a very biased sampling of the material covered by pupils in 
the seventh and eighth grades. This analysis was made and is on file 
with other materials of the study. However, an example from a list of 
questions used in one of the examinations just referred to (May, 1926) 
will illustrate. Of the 13 items included in the eight questions on United 
States history, one refers to colonial settlements, one to the Constitu- 
tion and Articles of Confederation, one to the Erie Canal, one to the 
Monroe Doctrine and the Missouri Compromise (a comparison), two to 
the “spoils system,” two to the Stamp Act, and one to the panic of 1837. 
All of these nine, in time, are definitely before 1860. One item refers 
to the panic of 1873, and three items refer to presidential electors. 
All of these 13 items are covered in the elementary course of study in 
Indiana before the beginning of the last one-half year of United States 
history. If the pupil were to choose six of the eight questions, he might 
well answer no question or cover no item in history later than the Mis- 
souri Compromise. 

Six sets of United States history questions, two each for the years 
of 1924, 1925, and 1926, were studied to determine the distribution in 
chronological time of the items given in these examinations. Not all 
of the items could be said to belong to any particular period of history. 
These items were omitted from this study. Table III gives this dis- 
tribution. 


TABLE III. DISTRIBUTION OF ITEMS ACCORDING TO PERIOD OF UNITED
STATES HISTORY

Period                        | Number of Items
1. Discovery and Colonization | 16
2. [illegible]                | 12
3. [illegible]                | 5
4. [illegible]                | 20
5. [illegible]                | 3
6. [illegible]                | 0
7. [illegible]                | 3
8. [illegible]                | 2
Total                         | 60











The bias in these questions is partially, and in some cases wholly,
due to the limited sampling made possible by this type of examination.

To determine what other states were doing in relation to the question
of examinations for pupils completing the elementary school work,
a questionnaire was sent to the State Departments of Education in the
various states. Three major questions were asked:

1. What use is made of new-type or standardized tests (achievement
or intelligence)?
2. In the question of promotion, what weight is given the recommendation
of the teacher?
3. Do you have examinations based on questions sent out from the
State Department?


The results of this questionnaire for the 38 states from which replies
were received are shown in Table IV. A copy of the questionnaire
will be found in the appendix.
TABLE IV. THE PRACTICE CONCERNING METHODS OF PROMOTING EIGHTH
GRADE PUPILS IN VARIOUS STATES

State | Do you use standardized tests? | What weight is given recommendation of teacher? | Do you have examinations based on State Department questions?
 1. [illegible]     | Some counties              | Varies with county      | No
 2. [illegible]     | None                       | 100 per cent            | No
 3. California      | Locally, especially cities |                         | No
 4. [illegible]     | Some                       | Largely                 | No
 5. [illegible]     | No                         | Wholly                  | No
 6. [illegible]     | None                       |                         | Yes
 7. [illegible]     | No                         | Varies                  | Yes
 8. [illegible]     | No statewide               | May give 50 per cent    | Yes
 9. Kentucky        | A few counties             | Mostly                  | No
10. Louisiana       | [illegible]                | Wholly                  | No
11. Massachusetts   |                            | Wholly, unless doubtful | No
12. Michigan        | None                       | Varies                  | Yes
13. [illegible]     | None                       | Considerable            | Yes and No
14. [illegible]     | Some                       |                         | No
15. [illegible]     | None                       | Very little             | Yes
16. Minnesota       | No information             | No information          | Yes
17. Mississippi     | None                       | Almost total            | No
18. [illegible]     | Yes                        | In doubtful cases       | Yes
19. [illegible]     | Some                       | No plan, some           | Yes
20. New Hampshire   |                            |                         | No
21. Nebraska        | No                         | None                    | Yes
22. [illegible]     |                            |                         | Yes
23. New Jersey      | Some                       |                         | Yes
24. North Dakota    | No                         | None                    | Yes
25. Oklahoma        | None                       | Varies                  | Yes
26. [illegible]     | Locally                    | 33 1/3 per cent         | Yes
27. Pennsylvania    | Locally                    |                         | No
28. Rhode Island    | Locally                    |                         | No
29. South Carolina  | Locally                    | 33 1/3 per cent         | No
30. South Dakota    | Locally                    | Differs, 50 per cent    | Yes
31. [illegible]     | In cities                  | Wholly                  | No
32. [illegible]     | Locally                    | Principally             | No
33. [illegible]     | Locally                    | Principally             | No
34. Virginia        | Locally                    | Principally             | No
35. West Virginia   | No                         |                         | Yes
36. [illegible]     | Yes                        |                         | Yes
37. [illegible]     | None                       | Exemptions              | Yes
38. [illegible]     | No                         | Some                    | Yes








It will be noted from this table that five of the states that use
state examinations also use some new type of standard test. In none of
these states had its use at this time become state-wide in all subjects.

From this preliminary study of marking, comprehensiveness, sampling,
and the practice of other states, it seemed evident that the old
type of examination did not meet adequately the needs of the state of
Indiana. The findings of this preliminary study were in accord with
the thought expressed by some members of the County Superintendents’
Association and the State Department of Public Instruction.

A committee made up of representatives of the County Superintend- 
ents’ Association, the State Department of Public Instruction, and the 
Bureau of Coöperative Research of Indiana University met and discussed
a proposal to change the type of examination. This committee
decided to replace the essay type examination with some form of a
new-type or standard test for use in seventh and eighth grade promotion.
The committee then drew up the principles to control the administration,
development, and use of the new-type examination, to be known
as the “Indiana Composite Achievement Test.”

The following principles were agreed upon by the committee and 
were later adopted by the County Superintendents’ Association:²


1. That the Indiana Composite Achievement Test be based on the Indiana
state course of study and state adopted textbooks.³

2. That the test consist of nine subjects: arithmetic, American history,
Indiana history, civics, language, reading, geography, physiology, and
spelling.

3. That the test be assembled in a booklet form similar to that of any
composite test, in such a manner that the scores for each subject as
well as a composite score for all subjects can be determined, also
that directions for using, keys for scoring, and norms for interpretation
be furnished with the test.

4. That two equivalent forms be developed for each year.

5. That (a) the county superintendents be provided with this test at
cost, which shall not exceed 12 cents a copy⁴; (b) that the test be
paid for by the time of distribution by the various county superintendents;
(c) that the State Department of Public Instruction bear
the cost of shipping the tests to the various counties.

6. That the county superintendents coöperate with the Bureau of Coöperative
Research in experimental try-outs of the test material.

7. That it is distinctly understood that the adoption of this new type 
of examination would in no way obligate the county superintendents 
to its use if they are using or care to use some system of promotion 
of eighth grade pupils into high school, other than the state examina- 
tion. Nor will any attempt be made to establish certain standards 
to be met before the pupils can be promoted. It is the duty of the 
county superintendent to determine who shall enter the high schools 
under his jurisdiction and no attempt will be made to usurp this 
power. It is intended only that the new-type examination be used 
as a substitute for the old examination and nothing more. 




Thus evolved the first major problem of this study—to develop a 
new-type composite test for seventh and eighth grade promotion sub- 
jects. 


² From minutes of the County Superintendents’ Association.

³ This was further interpreted to mean that, in cases where the state had not
adopted a single book, the various books used should be studied to determine, if
possible, common content.

⁴ The actual cost was found to be 8 cents per copy.
Summary


1. The state of Indiana had for some time found it advantageous 
to use state-wide examinations for promotion from the elementary school 
to the high school. 

2. The marking of the same essay type examination by different 
graders showed a wide variability in marks given. 

3. The number of items to be covered in an essay type examina- 
tion, of the type used in Indiana, was very small. This did not pro- 
vide an extensive sampling of the subject-matter covered. 

4. The sampling of questions or items on the essay type examina- 
tion, of the type used in Indiana, was in many cases decidedly biased. 

5. Other states that give state-wide examinations are using to a 
limited degree new-type examinations or standard tests.

6. The County Superintendents’ Association voted to replace the 
essay type examination with a new-type or standard test to be known 
as the “Indiana Composite Achievement Test.” 




CHAPTER II 
THE DEVELOPMENT OF A COMPOSITE TEST 


THE first purpose of this study was to develop an improved and 
practical plan for seventh and eighth grade promotion in Indiana rural 
schools by a new-type examination or standard test. 


The Selection and Validation of the Items of the Test 


The original items selected for possible use in a standard test should 
be based upon the material as nearly common to the experience of all 
the pupils taking the examination as possible.¹ In this study it seemed
possible to obtain the items that might be considered common to the 
pupils of the rural schools of Indiana by using three sources: (a) the 
state courses of study, (b) the state adopted or most used textbooks, (c) 
the content of previous old-type examinations given to these pupils dur- 
ing the last three years—1924, 1925, and 1926. Three lists of questions 
were used each year. This made a total of nine sets of examination
questions. 

Committees composed of teachers, consolidated school principals, and 
county superintendents analyzed, under the writer’s direction, the con- 
tent of the course of study and the state adopted or most used textbooks 
in the various subjects. Those items included in the last two years of 
elementary school work in each subject were then listed to be considered 
by the committee in conference with the writer. The writer analyzed 
the content of the eight sets of essay type examination questions used 
in 1924, 1925, and 1926. All of the different items resulting from the 
analysis of the course of study, the textbooks, and the examination
questions were then compiled into one group for each subject. 

A list was then made of the objectives stated or inferred in the 
course of study for each of the nine subjects for the last two years of 
the elementary school. It was at once recognized that not all of the 
objectives of the teaching of any subject could be directly measured. As 
an illustration the following example is given. 

The following objectives or aims are given in the Course of Study 
in Language, under “Oral Composition”:²


“1. To enrich the pupil’s experience and to help him to communicate his 
ideas truthfully and pleasantly. 

“2. To teach the pupil how to study independently and how to coöperate
in socialized language activities.”


¹ The methods described are those used with all forms of the test developed, unless
otherwise stated, altho the illustrations used may be from any one of the four forms,
A, B, C, or D. Forms X and Y have since been developed.

² Manual with Course of Study in Language, Grammar and Composition for the
Elementary Schools of Indiana, Department of Public Instruction, Bulletin No. 47D,
1926, p. 178.
It was not thought expedient to attempt to measure these aims directly
in a pencil and paper test.

In spelling the following five aims are given:³


“I. To develop the ability to spell correctly commonly used words that
are in the vocabulary of the pupils of each grade. 

“II. To develop habits of correct spelling so that the arrangement of 
the letters in their proper order becomes automatic and almost in- 
stantaneous. These habits of correct spelling should include the 
ability to recognize at once the correctness or incorrectness of a 
word—a spelling consciousness. 

“III. To develop an appreciation of the value of correct spelling. This
is sometimes called a spelling conscience—a ‘something within’ that 
is hurt or annoyed by incorrect spelling. 

“IV. To develop an appreciation of the value of checking and recording 
one’s progress, of watching and improving one’s self. 

“V. To develop a technique for the study of spelling, including a method 
of attack, a knowledge of how to use the dictionary, and the appli- 
cation of a few inductive rules governing word formation.” 


In the spelling test developed it was the intention to measure di- 
rectly only the first aim, “To develop the ability to spell correctly com- 
monly used words that are in the vocabulary of the pupils of each 
grade.” 

In the light of the objectives to be measured, the writer discussed 
with the various committees the validity of items in a test. Each mem- 
ber of each of the various committees then selected from the total items, 
obtained from the analysis of the courses of study, textbooks, and exam- 
ination questions, those items which he considered might well be used 
in the test. This was done by having each member cross out on his 
list of items “those that you consider would not be valid as a measure 
of achievement of seventh or eighth grade pupils.” The writer served 
as a member of each committee. Any item crossed out by any mem- 
ber of the committee for a given reason was then dropped from the 
list. 
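The screening rule just described, under which an item survives only if no committee member crossed it out, can be sketched as a simple set operation. The item labels, member names, and function name below are hypothetical and serve only to illustrate the rule.

```python
def retained_items(candidate_items, crossed_out_by_member):
    """Keep only the items that no committee member crossed out."""
    # Pool every item crossed out by any member.
    rejected = set().union(*crossed_out_by_member.values())
    return [item for item in candidate_items if item not in rejected]


# Hypothetical committee judgments (not data from the study).
candidates = ["item 1", "item 2", "item 3", "item 4"]
crossed_out = {
    "member A": {"item 2"},
    "member B": {"item 2", "item 4"},
    "writer": set(),
}

print(retained_items(candidates, crossed_out))  # ['item 1', 'item 3']
```

Such a rule is strict: a single objection is enough to remove an item, so the surviving list reflects unanimous acceptance.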

The total number of items remaining for each subject is given in 
Table V. 


³ Spelling Manual with Course of Study for the Elementary Schools of Indiana,
Department of Public Instruction, Bulletin No. 47G, 1926, p. 3. 
TABLE V. TOTAL NUMBER OF ITEMS SELECTED FOR USE IN THE INDIANA
COMPOSITE ACHIEVEMENT TEST BY THE VARIOUS COMMITTEES

Subject     | Number of Items
[illegible] | 96
[illegible] | 203
[illegible] | 114
[illegible] | 145⁴
[illegible] | 233
[illegible] | 389
[illegible] | 230⁵
[illegible] | 130
[illegible] | 345⁶








The items were then turned into new-type questions.⁷ The members
of each committee were given a list of rules and suggestions to
guide them in this phase of the work.⁸ The new-type questions thus
formed were discussed with the members of the committees and changes 
were made to adjust better the type of question to the item tested upon. 

These new-type questions were then divided into two equal groups 
for each subject, by the judgment of the various committees, for the 
purpose of making each list of questions cover somewhat nearly the 
same phases of the course of study. For example, the language items 
that had to do with usage were so divided that each would have the 
same number of examples and cover approximately the same type of 
usage altho the material was different. 

The questions were then typed and copies were sent to grade school 
principals asking them to have their seventh and eighth grade teachers 
go over the items of these forms and to put a cross before items that 
they “did not consider as good questions or items to include in an exam- 
ination covering the last two years of this subject in the elementary 
school.” Only a few judgments were obtained on these items in this 
manner; however, it served as a check on the work of the various committees.⁹
All questions marked in the manner described above were
dropped from the list. The questions were then carefully checked by 
the writer both for form and content. The number of questions or 
items remaining for use in try-out form is given in Table VI.

⁴ The original items were selected by analysis of five most used texts. Items common
to three of these texts comprised the items from textbooks.

⁵ The number of items was not complete as it did not include the questions on
the selections to be used in the comprehension test.

⁶ These represented words selected from the state adopted text by a method of
random selection. The number was not reduced by the method used by the other
committees.

⁷ For reasons given, it was deemed advisable to use to a very limited degree the
true-false question.

⁸ The suggestions given were taken from Paterson, Donald G., Preparation and
Use of New Type Examinations, World Book Company, 1926; and Ruch, G. M., The
Improvement of the Written Examination, Scott, Foresman and Company, 1926; with
further suggestions given by the writer.

⁹ The list of spelling words was not validated in this manner.
TABLE VI. TOTAL NUMBER OF QUESTIONS USED IN TRY-OUT TEST

Subject     | Number of Items
[illegible] | 90
[illegible] | 189
[illegible] | 110
[illegible] | 138
Geography   | 228
[illegible] | 367
[illegible] | 228
[illegible] | 126
[illegible] | 173¹⁰











Preparation and Giving of the Try-out Form 


Due to the fact that many of the persons who would use the Indi- 
ana Composite Achievement Test would not be familiar with standard 
tests or new-type examinations, the following general principles were 
used: (a) that the responsibilities of the examiner be reduced to a 
minimum, (b) that the pupil be given an opportunity to become ac- 
quainted with the new type of examination before taking it, (c) that 
types of questions requiring corrections in scoring be reduced to a 
minimum, (d) that the test be given in the order which would cause 
the best adaptation of the examination to the pupil. 

In view of these principles the test in the try-out form was or- 
ganized and given as nearly as possible like the final form to be used. 
The test was also organized or edited so as to have all the directions 
and examples to be read by the pupil only. The examiner’s work under 
this method was reduced to the giving of general directions concerning 
filling out the information blanks on the test, etc., and the keeping of
time. This put a rather heavy premium upon ability to read and under- 
stand the directions of the test. 

A practice test was sent out several days before the composite test 
was given. This practice test included new-type questions similar in form
to those found in the Indiana Composite Achievement Test. The various
teachers used these to show pupils how to answer or mark the
new-type questions. This same method was used with the final forms 
when they were given. 

Very few alternate response type questions were used in any of 
the forms of the test developed. This was done intentionally for two 
reasons: first, to prevent having any corrections to make in scoring, 
since with many people this would not be understood and would com- 
plicate scoring; and, second, because the best method of scoring the 
true-false type of questions is as yet not definitely determined.¹¹

In view of the fact that the pupils who would take the test would
be somewhat unacquainted with this type of test and also that a new-type
or standard test in spelling or arithmetic is not so unlike the ordinary
type of test in these subjects, it was decided to give these tests
first. The spelling test was given in approximately the same order as
such tests were given in the classroom each week.¹² This test was followed
by the arithmetic test which, in so far as the child’s work is
concerned, is not much unlike the old-type test.

¹⁰ Selected by random sampling.

¹¹ Ruch (36), Wood, B. D. (46), May (27), Paterson and Langley (83), Holzinger
(18), have made extensive studies of this question. Their findings do not agree as to
the best method of scoring such tests.

The two forms of the new-type test, equated as to number of ques- 
tions, type of questions, material covered, and relative difficulty, as 
judged by the committees, were then mimeographed and sent to the 
county superintendent of schools in 21 counties, well distributed over the 
state, on October 1, 1926.¹³ This try-out test was given to pupils in
the ninth grade who had completed the eighth grade work the previous 
year in a rural one-, two-, or three-room, or consolidated school. These 
tests were given by the consolidated school principal or the county su- 
perintendent of schools. 

So that the study of the two groups of questions might be made 
with a knowledge that the groups of pupils taking them were of equal 
preparation and ability, two methods were used to obtain equated groups 
taking each of the forms or groups of new-type questions. In one case 
the two forms or groups of questions were given at the same time and 
every other pupil took one of the groups of try-out questions and the 
other pupils took the other group of questions. In the other case all 
the pupils were given the first group of questions and a few weeks 
later the same pupils were given the other group of questions. Another 
method, which now seems to the writer to be a better one, might have 
been used. This method would have been to give all the items found 
in both groups of questions on one-half the subjects at one sitting and 
a few days later to give all the items on the remaining subjects to the 
same group of pupils.
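The first of these methods, in which every other pupil received the same group of questions, can be sketched as follows. The pupil labels, the names given to the two groups of questions, and the function name are assumptions made for illustration only.

```python
def alternate_forms(pupils, forms=("first group", "second group")):
    """Give the groups of questions to pupils in rotation, in seating order."""
    return {pupil: forms[i % len(forms)] for i, pupil in enumerate(pupils)}


# Hypothetical class of five pupils in seating order.
assignment = alternate_forms(["pupil 1", "pupil 2", "pupil 3", "pupil 4", "pupil 5"])
for pupil, form in assignment.items():
    print(pupil, "takes the", form, "of questions")
```

Alternating in this way splits each class about evenly, on the assumption that seating order is unrelated to preparation or ability.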

It was necessary to try to obtain information of three kinds on 
this try-out test: (a) time limits on the various subjects or parts of 
the test; (b) the adequacy of the instructions and the clarity of the 
items as shown by the pupils’ responses, pupils’ criticisms, and the sug- 
gestions of those giving the test; (c) more knowledge concerning the 
validity of the items of the test. 

To obtain the time limits for each of the tests¹⁴ two methods were
used: (a) The examiner instructed the pupil to hold up his hand when
he had finished all the items in each given subject in the test. The
pupil was then to close his booklet. The examiner started all of them
together and took the time at which 90 per cent of the pupils had
finished as the total time. The record was sent in. (b) The examiner
started all pupils together and at the end of a certain number of minutes
told the pupils to write a “1” on the margin of the paper; at intervals
of two minutes following they were told to write a “2,” a “3,” etc.,
on the margin of the page where they were working, until a certain
time limit was reached and all were to stop. This time limit varied
from subject to subject. From these two measures of number of items
per unit of working time, it was possible to determine the length of
time necessary for approximately 90 per cent of the pupils to answer
all the items on the test. The second method proved to be more accurate.
To try to get further criticism on the method of administration,
the instructions, and the clarity of statement of items on the test, each
examiner was asked to criticize those factors on the copies of the try-out
test. He was further instructed to ask the pupils at the close of the
test to tell him of parts of the test they did not understand. This
applied both to instructions and questions. From the judgments given
by the examiners and pupils who took the try-out tests some changes
in instructions were made and a few items were dropped from the test.

¹² The test-study-test method of spelling is in use in this state.

¹³ October 3 and 17, 1927, for Forms C and D.

¹⁴ These tests later became Forms A and B for 1927 and Forms C and D for 1928.
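The first method of setting a time limit, taking the time at which 90 per cent of the pupils had finished, can be sketched as below. The finishing times are hypothetical, and the function name and the rounding-up rule for the number of pupils required are assumptions; the study does not state how fractional pupils were handled.

```python
def time_limit(finish_minutes, percent=90):
    """Smallest recorded time by which `percent` per cent of the pupils had finished."""
    if not finish_minutes:
        raise ValueError("no finishing times recorded")
    ordered = sorted(finish_minutes)
    # Number of pupils making up the required share, rounded up (an assumption).
    needed = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[needed - 1]


# Hypothetical finishing times, in minutes, for ten pupils on one subject.
times = [12, 15, 11, 14, 18, 13, 16, 17, 25, 19]
print(time_limit(times))  # 19
```

In this example the single slow pupil who needed 25 minutes does not lengthen the limit, which is the practical point of using 90 per cent rather than all pupils.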

The two groups of test items were then scored. The correct re- 
sponses as predetermined were counted as right. Responses that did 
not agree with the predetermined response, but that seemed to the scorer 
might be counted as correct, were checked. Definitely wrong items were 
so marked. The responses that were possibly correct were gone over 
carefully by the writer. Where such a response was such as could be 
counted correct it was so credited. The final authority on any response 
was the state course of study and the state adopted or most used text- 
books. If the item was finally used, the scoring key made provision for 
more than one correct response or changed the question to make pos- 
sible only one correct response. 

The per cent passing each test item was determined. Practically 
all pupils attempted to answer each item. However, the per cent pass- 
ing each item was determined by dividing the total number answered 
correctly by the number that had attempted to answer the item. It 
is worthy of note in passing that, altho the items had been arranged in 
what seemed to the committees the order of increasing difficulty, the 
results obtained in the per cent passing each item or question did not 
bear out this judgment to any marked degree. 
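
The computation of the per cent passing, with unattempted items left out of the divisor, may be sketched as follows. The function name and the response coding (True, False, None) are assumptions made for illustration.

```python
def per_cent_passing(responses):
    """Per cent of pupils passing one item.

    responses holds one entry per pupil: True (right), False (wrong),
    or None (item not attempted). Pupils who did not attempt the item
    are left out of the divisor, as in the text.
    """
    attempted = [r for r in responses if r is not None]
    if not attempted:
        return None  # no pupil attempted the item
    return 100.0 * sum(attempted) / len(attempted)


# Hypothetical item: two right, one wrong, one not attempted.
print(per_cent_passing([True, True, False, None]))
```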

The school principal in each of the various schools in which the 
try-out forms were used was asked to indicate what he considered the 
best 10 per cent of the pupils of the group and the poorest 10 per cent 
of the pupils of the group. This meant that approximately two of 
the poorest and two of the best pupils in each group were indicated, 
since the groups varied in size from 15 to 25 pupils. 

A study was then made of the per cent of the “good” and the 
“poor” groups passing each item. These two groups represented ap- 
proximately the best 10 per cent and the lowest 10 per cent of the 
pupils according to the judgment of the consolidated school principals. 
In all there were 40 pupils in the “poor” group and 44 in the “good” 
group. All items in which the per cent passing of the “good” group 
was not greater than that of the “poor” group were eliminated. This 
measure of “goodness of items” threw out many of the easier ques- 
tions and questions that were evidently ambiguous in statement. This 
method obviously threw out all questions passed by 100 per cent of all 
the pupils and those passed by no pupils. 
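
The elimination rule just stated may be illustrated with a short sketch. The item names and per cents are hypothetical; only the rule itself (keep an item when the “good” group passes it more often than the “poor” group) is taken from the text.

```python
def keep_item(good_pct, poor_pct):
    """Keep an item only if the "good" group passes it more often
    than the "poor" group."""
    return good_pct > poor_pct


# Hypothetical per cents passing: (good group, poor group).
items = {
    "item_1": (100.0, 100.0),  # passed by every pupil: eliminated
    "item_2": (91.0, 35.0),    # separates the groups: kept
    "item_3": (40.0, 55.0),    # poor group did better: eliminated
    "item_4": (0.0, 0.0),      # passed by no pupil: eliminated
}
kept = [name for name, (good, poor) in items.items() if keep_item(good, poor)]
print(kept)
```

The sketch shows why items passed by all pupils or by none fall out automatically: the two per cents are equal, so the strict inequality fails.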

The remaining items of the two groups of tests were then considered 
as one group of items. The average per cent passing all items, 
the highest per cent passing any item, and the lowest per cent passing 
any item were calculated. This is, in a rough way, a measure of the 
difficulty of the test. While the exact average per cent passing all items 
for a good test is not definitely determined, it should probably be near 
50.¹⁵ Something of the range of difficulty is shown by the easiest and 
most difficult items. The total number of items remaining for each 
subject to be used in the final forms of the test, and the highest per 
cent passing any item, the average per cent passing all items, and the 
lowest per cent passing any item are given in Table VII. 


TABLE VII.—TOTAL NUMBER OF ITEMS PER SUBJECT REMAINING AFTER THE 
METHODS OF VALIDATION HAD BEEN USED WITH MEASURE OF RELATIVE 
DIFFICULTY OF ITEMS BY SUBJECTS, INDIANA COMPOSITE ACHIEVEMENT 
TEST, FORM A 

                         Number    Highest     Average     Lowest 
Subject                  of        Per Cent    Per Cent    Per Cent 
                         Items     Passing     Passing     Passing 
                                   Any Item    All Items   Any Item 
Arithmetic..........      61       [?]         56.3        16 
American History....     156       88          52.5         7 
Indiana History.....     103       98          50.5        12 
Civics..............      92       98          54.0        15 
Geography...........     156       94          62.5        14 
Language............     232       98          60.7        12 
Reading.............     198       98          57.9        15 
Physiology..........      96       98          74.8        18 
Spelling............     100       98          66.5        24 


From a study of the highest per cent passing any item, the low- 
est per cent passing any item, and the average per cent passing all 
items, it seems that the items were on the whole too easy. This proved 
to be true with two of the subject tests of this series,—i.e., physiology 
and spelling. When the distributions were made for a large number 
of cases, taking the two tests in the final forms, the distribution was 
badly skewed to the right. All the distributions except American his- 
tory and Indiana history had a positive skew. These two had a nega- 
tive skew when distributions from a large number of cases were used. 
The time element would be a factor in determining the distribution. 
However, since the time on these tests was set at a point where ap- 
proximately 90 per cent would finish the test, there was need for more 
difficult items to be included.¹⁶ 

All of the items for each subject were then separated, according 

15 Symonds, Percival M., Measurement in Secondary Education, p. 301. The 
Macmillan Company, 1927. 

16 An attempt was made to have a few more difficult items in Forms C and D, 
that were developed for use in 1928, but this was not accomplished. The use of a 
shorter time limit would be necessary to prevent the skewness of distribution with 
these items. This could, however, not be applied to spelling, which has caused the 
most difficulty due to ease of items. 
to the type of questions, into two groups of equal difficulty. The per 
cent passing each item was used as a measure of the relative difficulty. 
For example, for the year 1927, of the 48 multiple-response-single-choice 
type of questions in geography, in which the response was a single word 
or phrase, 24 were used in one form of the test and 24 in the other. 
The items were separated into these two groups by putting the easiest 
question in one group, the next hardest in the second group, the third 
hardest in the second group, and the fourth hardest in the first group, 
etc., until the questions were put into the two groups. This made the 
two groups of multiple-response-single-choice type of questions in this 
subject test of equal difficulty on the basis of the results of the try-out 
of the items or questions. One of these groups of 24 items was a part 
of what became Form A and the other group of 24 items a part of what 
became Form B of the geography test. This same procedure was fol- 
lowed thruout the whole number of items for the test and all others that 
made up the composite test. 
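
The alternating assignment described above (first group, second group, second group, first group, and so on down the list from easiest to hardest) may be sketched as follows. The function name, item labels, and per cents are hypothetical.

```python
def split_into_equated_forms(per_cent_passing):
    """Divide items of one question type into two groups of nearly
    equal difficulty by the first, second, second, first pattern."""
    # Order item names from easiest (highest per cent passing) to hardest.
    ordered = sorted(per_cent_passing, key=per_cent_passing.get, reverse=True)
    form_a, form_b = [], []
    for position, item in enumerate(ordered):
        # In every run of four, positions 0 and 3 go to the first group
        # and positions 1 and 2 to the second group.
        if position % 4 in (0, 3):
            form_a.append(item)
        else:
            form_b.append(item)
    return form_a, form_b


# Hypothetical per cents passing for eight items of one type.
passing = {"q1": 90, "q2": 84, "q3": 80, "q4": 72,
           "q5": 66, "q6": 60, "q7": 51, "q8": 45}
form_a, form_b = split_into_equated_forms(passing)
print(form_a, form_b)
```

In this illustration the two groups average 68.25 and 68.75 per cent passing, which shows how the pattern keeps the groups close in difficulty without any item being shared.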

All subjects having been divided into Forms A and B, with questions 
of equal difficulty, it became necessary to check each form against 
the other to determine whether each subject on each form covered ap- 
proximately the same phases of the subject. This check on the sam- 
pling and comprehensiveness of the test made it necessary to change 
some items or questions from one group to another. 

The test items on each form of each subject were arranged in 
order of increasing difficulty within the type of question,—that is, 
matching questions were arranged together in order of difficulty, mul- 
tiple-response questions together, etc. It is a generally accepted rule 
that the test should begin with the easiest item or question and the 
questions should be placed in order of increasing difficulty. If this rule 
were carried out strictly, the first question would be easiest, the next 
one harder, and so on until the last question would be the most diffi- 
cult, regardless of the type of question. In this way, it would have 
been necessary to place a matching type of question directly after a 
multiple-response-single-choice type of question. This could not well be 
done. Therefore it was thought advisable to put questions of each type 
in order of increasing difficulty, and then to put these groups or blocks 
of questions of different types in order of increasing difficulty. The 
average difficulty of each block of questions was taken as the difficulty 
of that type. This would mean that often the last question of one type 
of question was more difficult than the first question of the block of 
questions that followed it. The following example will illustrate. 

In Indiana history, Form C, there were three types of questions: 
(a) multiple-response-single-choice, (b) rearrangement, (c) matching. 
The per cent passing the multiple-response questions ranged from 90 
to 30. The per cent passing the rearrangement questions ranged from 
87 to 24. The per cent passing the matching questions ranged from 
78 to 17. The average per cent passing each group was 62, 55.8, and 
48.4 respectively. The groups of questions were thus arranged in such 
an order as to have increasingly difficult questions in each group and 
increasingly difficult groups. 
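
The two-level arrangement may be sketched in Python. The per cents below are hypothetical; only the extreme values of each block echo the ranges quoted for Indiana history, Form C, and the function name is an assumption.

```python
from statistics import mean


def arrange_test(blocks):
    """Order items within each block, then order the blocks.

    blocks maps a question type to the per cents passing its items.
    A higher per cent passing means an easier item.
    """
    # Within each block the easiest item comes first.
    ordered = {kind: sorted(pcts, reverse=True) for kind, pcts in blocks.items()}
    # Blocks run from easiest to hardest by average per cent passing.
    order = sorted(ordered, key=lambda kind: mean(ordered[kind]), reverse=True)
    return [(kind, ordered[kind]) for kind in order]


blocks = {
    "matching": [17, 78, 50],
    "multiple-response": [30, 90, 66],
    "rearrangement": [87, 24, 56],
}
for kind, pcts in arrange_test(blocks):
    print(kind, pcts)
```

As the text notes, the last item of one block (30 per cent passing here) can be harder than the first item of the block that follows it (87 per cent passing).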


Reliability of the Subject Tests of the Indiana Composite Achievement 
Test, Form A 


The coefficient of reliability of each test in Form A was determined 
by giving this test to 40 pupils then completing (at mid-year) the eighth 
grade in a city school. The coefficient of reliability of Form B was not 
determined at this time, but was later determined when reliabilities 
were calculated with a larger number of cases with both Forms A 
and B.¹⁷ 

The coefficients of reliability of the various tests of Form A were 
calculated primarily to determine whether all subject tests taken together 
as a composite test would give a coefficient of reliability sufficiently 
high to be used as a basis for individual promotion. This 
coefficient of reliability has been generally accepted to be .90 or above.¹⁸ 

The coefficients of reliability of the various tests were determined 
by the even vs. odds method. The coefficient of reliability of the composite 
of these tests was determined (a) by adding together all the even 
scores made on each of the nine subjects by a pupil and (b) by adding 
together all the odd scores made on each of the nine subjects by a pupil, 
then correlating these two resulting numbers for all pupils. These resulting 
correlations were then corrected by the use of the Spearman-Brown 
formula. The correlations determined in this manner are given 
in Table VIII. 
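
The even vs. odds procedure with the Spearman-Brown correction may be sketched as follows. This is an illustration under stated assumptions, not the computation used in the study: items are taken as scored 0 or 1, the data are hypothetical, and the function names are invented for the example. The correction for a test of doubled length is 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(half_r):
    """Step a half-test correlation up to the full-length test."""
    return 2 * half_r / (1 + half_r)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per pupil, in test order."""
    odd_totals = [sum(p[0::2]) for p in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(p[1::2]) for p in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical scores of four pupils on a four-item test.
pupils = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]]
print(split_half_reliability(pupils))
```

For the composite, the same functions would be applied after adding each pupil's odd-item totals over all nine subjects and likewise the even-item totals.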


TABLE VIII.—COEFFICIENTS OF RELIABILITY OF THE NINE SUBJECT TESTS AND 
THE COMPOSITE TEST, FORM A (40 PUPILS) 

                         Coefficient of 
Subject                  Reliability 
Arithmetic..........     .806 
American History....     .812 
Indiana History.....     .690 
Civics..............     .823 
Geography...........     .823 
Language............     .7[?]8 
Reading.............     .868 
Physiology..........     .711 
Spelling............     .853 
Composite...........     .910 


While the reliability coefficients were not high for all of the subject 
tests, the reliability coefficient for the composite test was high 








17 The same practice was followed with Forms C and D. The writer does not 
justify this practice except upon the grounds of expediency. 

18 Ruch, G. M., and Stoddard, George D., Tests and Measurements in High School 
Instruction, p. 56. World Book Company, 1927. Kelley, Truman L., Interpretation of 
Educational Measurements, p. 211. World Book Company, 1927. Symonds, Percival M., 
Measurement in Secondary Education, p. 299. The Macmillan Company, 1927. 
enough that it seemed it might be used satisfactorily as a measure of 
promotion.¹⁹ 


Weighting the Subjects of the Composite Test 


In a composite test, or a test made up of several subject tests in 
which the total score is used as a single measure, it is necessary to 
determine the relative weight to be given to each test. For example, 
in Form A the number of items in the subject tests varied from 30 
problems in arithmetic to 116 items in language. If each were to be 
scored on the number of items answered correctly, language could easily 
have more weight than arithmetic. To give each of these tests the same 
weighting, it would be necessary to divide each score by its standard 
deviation and then the resulting scores added together would give the 
pupils’ total score in which each subject had equal weight. 
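
The equal-weight total described above may be sketched as follows. The scores are hypothetical, the function name is invented, and the use of the population standard deviation (`pstdev`) is an assumption; the text does not say which form of the standard deviation was used.

```python
from statistics import pstdev


def equal_weight_totals(scores_by_subject):
    """Total score per pupil with every subject given equal weight.

    scores_by_subject maps a subject to its raw scores, one per pupil,
    with the pupils in the same order for every subject.
    """
    sigmas = {subject: pstdev(scores) for subject, scores in scores_by_subject.items()}
    n_pupils = len(next(iter(scores_by_subject.values())))
    totals = []
    for i in range(n_pupils):
        # Each raw score is divided by the standard deviation of its subject.
        totals.append(sum(scores[i] / sigmas[subject]
                          for subject, scores in scores_by_subject.items()))
    return totals


# Hypothetical raw scores of three pupils: a short test and a long test.
scores = {"arithmetic": [10, 20, 30], "language": [40, 80, 120]}
print(equal_weight_totals(scores))
```

Although the language scores run four times as high as the arithmetic scores in this example, each subject contributes equally to the totals once divided by its own standard deviation.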

There are several methods of determining the weight that should be 
given to the various subjects in such a test. One method is that of 
weighting each subject according to its importance in the curriculum 
as determined by the per cent of time given to it in that part of the 
curriculum over which the test extends. Another is that of determining 
the relative weight to be given to any subject by its social value as 
determined by pooled judgments. Another method is that of determin- 
ing the relative propaedeutic value of that subject for grades or work 
ahead. 

Three statistical measures should be given consideration in the best 
possible weighting of each test in a composite test. These are (a) the 
standard deviation of the scores, (b) the reliability of the test, (c) the 
independence of each test of all other tests. Each test should be 
weighted inversely as its standard deviation, directly as its reliability, 
and inversely as its correlation with other tests of the composite. 

The method used in weighting the various subject tests in this 
composite test included, to some degree, all the methods suggested above, 
but it did not include any one of these specifically and accurately. 

The following letter was sent to the county superintendent in each 
county. Three blanks were sent with each letter. 


“Dear Superintendent: 

“In developing the Indiana Composite Achievement Test, nine school 
subjects are used. They are: arithmetic, reading, language, American 
history, Indiana history, geography, civics, spelling, and physiology. 
These nine subjects may not have equal value in so far as promotion 
from the 8th grade is concerned. Therefore, it is necessary to get combined 
judgment from school people on the relative importance of the 
various subjects measured in this test. 

“For this purpose you will find attached three blanks. Will you 
fill out one yourself and have a good 8th grade teacher and a good con- 
solidated high school principal each fill out one. Return all of these as 
soon as possible. Do not confer with each other before filling out the 
blank, as we want separate judgments. 

“Yours truly, 
” 

19 This cannot be determined alone by the coefficient of reliability. However, in 
view of the fact that the class used was only a one-year class (8A), it seemed that 
the correlation was sufficiently high to obtain an accurate measure. This proved to be 
true, as will be shown later in further discussions of reliability. 
EIGHTH GRADE PROMOTION 


Relative Importance of the Various Subjects Covered in the Indiana 
Composite Achievement Test 


Directions: Below are listed the nine subjects included in the In- 
diana Composite Achievement Test for promotion of 8th grade pupils. 
Will you distribute 225 points to these nine subjects according to your 
opinion of their relative importance in determining 8th grade promotion? 
A good way to do this is to pick out that subject or those subjects which 
you consider of average importance and give it or them 25 points each. 
Then think of the most important subjects and the least important sub- 
jects and rate them accordingly, giving more points to subjects of greater 
importance or value and less points to subjects of lesser importance. 
Give each subject some value. 


Subject Value in points 
Of the 270 possible replies, 228 were received. The values given 
on each of these blanks for each subject were added together and the 
mean value determined. This value is given in Table IX. 


TABLE IX.—MEAN VALUE GIVEN TO EACH SUBJECT IN THE DISTRIBUTION 
OF 225 POINTS AMONG NINE SUBJECTS BY TEACHERS AND ADMINISTRATORS 

Subjects                 Mean Rating 
Arithmetic..........     32.0 
American History....     25.5 
Indiana History.....     16.5 
Civics..............     21.0 
Geography...........     21.0 
Language............     30.0 
Reading.............     37.0 
Physiology..........     15.0 
Spelling............     21.6 


These relative values given to the various subjects of the combined 
test, called the pedagogical weighting, give a weight for each subject as 
determined by pooled judgments. By the formula 

\[ \text{Weighted Score} = \text{Raw Score} \times \frac{\text{Pedagogical Weight}}{\text{Raw Standard Deviation}}, \] 

it is possible to determine by what multiplier each raw score should be 
multiplied to obtain a composite score. The following work is given as an example: 
TABLE X.—THE WEIGHT FOR EACH SUBJECT IN THE INDIANA COMPOSITE 

                         Pedagogical   σ of Raw   Ped. Weight /     Weight 
Subject                  Weight        Score      σ (Raw Score)     Used²⁰ 
Arithmetic..........     32.0           5.0       6.4               3 
American History....     25.5          12.5       2.0               1 
Indiana History.....     16.5           6.5       2.5               1 
Civics..............     21.0           6.1       3.4               2 
Geography...........     21.0          11.14      1.9               1 
Language............     30.0          14.9       2.0               1 
Reading.............     37.0          14.0       2.7−              2 
Physiology..........     15.0           5.5       2.8               1 
Spelling............     21.6          10.4       2.1−              1 








This method of finding the weight did not seemingly take into ac- 
count the coefficient of reliability of the subject tests. However, since 
reading had the highest reliability coefficient, the weighting given it 
was slightly above the final figure obtained. Reading was given a 
weighting of two, altho by the method used it would only have been 
given a weighting of one. 
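
The multiplier in the weighting formula may be checked against three rows of Table X with a short sketch. The figures are those printed in the table; the function names are invented, and the further step by which the quotients were reduced to the whole-number weights actually used is not stated in the text and is therefore not reproduced here.

```python
# (pedagogical weight, sigma of raw score) as printed in Table X.
table_x = {
    "Arithmetic": (32.0, 5.0),
    "American History": (25.5, 12.5),
    "Indiana History": (16.5, 6.5),
}


def multiplier(pedagogical_weight, raw_sigma):
    """Pedagogical weight divided by the raw standard deviation."""
    return pedagogical_weight / raw_sigma


def weighted_score(raw_score, pedagogical_weight, raw_sigma):
    """Weighted Score = Raw Score x (Pedagogical Weight / Raw Sigma)."""
    return raw_score * multiplier(pedagogical_weight, raw_sigma)


for subject, (weight, sigma) in table_x.items():
    print(subject, round(multiplier(weight, sigma), 1))
```

The quotients for these three rows agree with the third column of the table when rounded to one decimal place.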


Norms 


Three methods were used in determining grade norms for use with 
these tests. (a) It was first thought that norms might be calculated 
from the try-out form upon the basis of the score on items used in the 
try-out form that were used in the final forms of the test. This was 
tried but the norms thus arrived at were low both for subject and for 
total composite scores, when checked against the norms found from the 
use of the printed forms of the test. (b) County superintendents were 
asked to send in the distribution of their scores after the final printed 
forms were given. After more than a thousand cases had been received, 
a distribution was made from which norms were established. The criti- 
cism of this method is that the norms which are needed by the various 
counties at once must of necessity be delayed several days. (c) The 
best method used in this study was to have representative counties give 
the tests a few days ahead of the regular date for giving them. The 
papers were then sent to the Bureau of Coöperative Research where 
they were graded and norms were arrived at. These norms were later 
compared with the distributions sent in by the various counties. In 
1928 this latter method was used and 1,080 papers were scored, distri- 
butions made, and norm sheets sent out in one day by the Bureau of 
Coöperative Research. These norm sheets reached all superintendents 
in plenty of time for their use. The grade norms are here given for 
Forms A and C. 

20 Due to a clerical error, the weighting given language and geography in Forms 
A and B was two instead of one, as it should have been. This is further discussed 
in a later chapter. 

TABLE XII.—GRADE NORMS ON INDIANA COMPOSITE ACHIEVEMENT TEST, 
FORM C, APRIL, 1928 (GRADES 6, 7, AND 8) 

Summary 


1. The original items, from which the items to be used in the 
Indiana Composite Achievement Test were selected, were developed from 
an analysis of the (a) state course of study, (b) state adopted textbooks, 
(c) previous state essay type examinations. 

2. Objectives of the various subjects were considered in the vali- 
dation of items. 

3. The items were (a) first selected by pooled judgments of committees 
working on the problem, (b) further validated by groups of 
seventh and eighth grade teachers, (c) criticized by examiners giving 
the try-out forms and pupils taking these forms. 

4. The items were turned into new-type questions. 

5. The new-type questions were divided into equated groups by the 
committees in such a way that each group would cover approximately 
the same phases of the subject. 

6. The new-type questions were mimeographed and given in this 
form to pupils who had finished the eighth grade. 

7. Four principles were followed in the organization and admin- 
istration of the test: (a) the responsibility of the examiner was re- 
duced to a minimum, (b) the pupil was given an opportunity to become 
acquainted with the form of the new-type examination before taking the 
test, (c) the type of questions requiring correction in scoring was kept 

6th grade based on 440 cases. 7th grade based on 320 cases. 8th grade based on 
1,080 cases. 
The norms are applicable to Form D, which is an equated form. 
at a minimum, (d) the examination was given in an order best adapted 
to the pupils’ previous work. 

8. Two methods were used for obtaining equated groups taking 
each of the two try-out forms: (a) every other pupil took the same 
form, (b) all pupils took one form at one date and the other form a 
few days later. 

9. The best method of determining time limits from the try-out 
forms was found to be that of having pupils “mark” in the margin of 
the test at given intervals. 

10. The tests were further selected by the method of per cent 
passing of “poor” (lowest 10 per cent of class) and “good” (best 10 per 
cent) groups. Items were thrown out that were not passed by a larger 
per cent of “good” than of “poor” pupils. 

11. All the remaining items were considered as one group and 
divided into two equated forms. 

12. The items or questions were arranged in order of increasing 
difficulty, upon the basis of per cent passing, within groups of like 
types of questions. The groups were arranged in order of increasing 
difficulty upon the basis of average per cent passing each group. 

13. The reliability of the subject tests and the total composite 
test was determined by giving the tests to a small number of pupils then 
in the 8A grade. 

14. The reliability of the composite test was high enough to insure 
satisfactory use of the measure in promotion of pupils. 

15. The weighting of the various subjects of the composite test 
was determined by (a) pedagogical weight, and (b) sigma of the distribution. 

16. The best method of establishing grade norms proved to be that 
in which selected counties gave the tests, these being scored and dis- 
tributions formed from them. 


CHAPTER III 
THE FREQUENCY DISTRIBUTION 


THE frequency distribution of a standard test, when used with a 
large number of cases, is expected to conform to a normal distribution. 
A study was made of the frequency distribution of Forms A, B, and C 
of the Indiana Composite Achievement Test. The 3,458 pupils’ scores 
used in the frequency distribution for Form A were those from tests 
given in April, 1927. The 406 pupils’ scores used for Form B were 
from tests given approximately one month later. The 1,080 pupils’ 
scores used for Forms C₁ and C₂ were from tests given in April, 1928. 
The distributions are given in per cents so that they may be more easily 
compared. Two groups of counties, picked as representative of the state 
as a whole, made up the distributions of Forms C₁ and C₂. These two 
distributions are given to show the reliability of the distribution of 
Form C₁ which was used as a basis for the norm. 


TABLE XIII.—PER CENT DISTRIBUTION OF TOTAL COMPOSITE SCORE OF 
INDIANA COMPOSITE ACHIEVEMENT TEST, FORMS A, B, AND C¹ 

                Per Cent                         Per Cent 
Score        Form A   Form B      Score       Form C₁  Form C₂ 
975-999                           660-679 
950-974                           640-659 
925-949        .1                 620-639       [?] 
900-924        .2                 600-619       [?]       .4 
875-899        .2                 580-599        .5       .9 
850-874        .6       .2        560-579        .8      1.5 
825-849        .8       .7        540-559       2.2      3.0 
800-824       1.3      1.2        520-539       3.8      5.7 
775-799       2.5      [?]        500-519       4.7      6.0 
750-774       2.7      4.2        480-499       7.0      6.2 
725-749       5.0      4.7        460-479       7.5      7.0 
700-724       5.7      7.4        440-459       9.0      8.1 
675-699       6.9      6.7        420-439       9.0      9.4 
650-674       6.9     10.1        400-419       9.1      8.4 
625-649       8.8     11.8        380-399       9.7      9.2 
600-624       9.0     10.3        360-379       7.0      8.0 
575-599       8.7      7.6        340-359       6.7      6.8 
550-574       7.9      7.4        320-339       5.8      6.4 
525-549       8.1      9.6        300-319       5.4      4.4 
500-524       6.5      5.4        280-299       4.4      2.5 
475-499       5.4      4.4        260-279       3.1      2.1 
450-474       4.2      3.0        240-259       1.6      1.7 
425-449       3.5      [?]        220-239        .9      1.2 
400-424       2.1      1.7        200-219        .2      [?] 
375-399       1.5      [?]        180-199        .6      [?] 
350-374        .6      [?]        160-179        .2      [?] 
325-349       [?]       .2        140-159        .2      [?] 
300-324        .3       .1        120-139 
275-299        .2                 100-119 
250-274       [?] 
225-249       [?] 
200-224        .1 














1 The weight given to subjects in Form C was different from that of Forms A and B. 
This accounts for the difference in total score. 

Number of cases: Form A, 3,458; Form B, 406; Form C₁, 1,080; Form C₂, 1,877. 

Figure I. Graphic Representation of the Distribution of Total Scores 
of the Indiana Composite Achievement Test, Form A, Compared 
to a Normal Distribution Curve (3,458 cases). [Horizontal axis: Scores; 
vertical axis: Per Cents of Total Cases.] 

Figure II. Graphic Representation of the Distribution of Total Scores, 
Indiana Composite Achievement Test, Form B, Compared to 
a Normal Distribution Curve (406 cases). [Horizontal axis: Scores; 
vertical axis: Per Cents of Total Cases.] 

Figure III. Graphic Representation of the Distribution of Total Scores, 
Indiana Composite Achievement Test, Form C, Compared 
to a Normal Distribution Curve (1,080 cases) 


A measure of skewness is a measure of how far a given frequency 
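distribution departs from a symmetrical or normal distribution. A measure 
of this skewness is obtained by the use of the formula 

\[ Sk = \frac{3(M - Md)}{\sigma}, \] 

where M is the mean, Md the median, and σ the standard deviation of the distribution. 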


This skewness for the frequency distribution of the total composite 
scores of Forms A, B, and C is given in Table XIV. 


TABLE XIV.—SKEWNESS OF FREQUENCY DISTRIBUTION OF TOTAL COMPOSITE 
SCORES, INDIANA COMPOSITE ACHIEVEMENT TEST, FORMS A, B, AND C 

Form          Skewness 
A.........    −.[?] 
B.........    −.13 
C.........    −.14 

All of the distributions are skewed slightly negatively or upward. 
The degree of skewness is very small. According to the measure used, 
the frequency distributions of these three forms approach closely a nor- 
mal distribution curve. 
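
The skewness measure may be checked numerically with a short sketch. It assumes the measure is three times the difference between mean and median, divided by the standard deviation; that reading is an inference from the printed means, medians, and sigmas, which it reproduces. The three rows below use the Form C figures printed in Table XVI, and the function name is invented for the example.

```python
def skewness(mean, median, sigma):
    """Skewness taken as 3 * (mean - median) / sigma."""
    return 3 * (mean - median) / sigma


# (mean, median, sigma) for three Form C subject tests, as printed in Table XVI.
form_c = {
    "Arithmetic": (50.5, 52.0, 16.9),
    "American History": (39.7, 38.8, 12.7),
    "Spelling": (34.5, 36.2, 10.9),
}
for subject, (m, md, s) in form_c.items():
    print(subject, round(skewness(m, md, s), 2))
```

A mean below the median gives a negative value and a mean above the median a positive one, which is the sense in which the distributions are described as skewed negatively.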
The distributions of the spelling scores were quite unsatisfactory. 
The data are given for the spelling tests of the three forms, A, B, and C, 
in Table XV. 


TABLE XV.—PER CENT DISTRIBUTION OF SCORES IN SPELLING, INDIANA 
COMPOSITE ACHIEVEMENT TEST, FORMS A, B, AND C² 

Spelling—Distribution of Scores 
Number of Pupils 
Score 
Form A | FormB Form C 
GES co dns caste nc kaka cere | 4.3 5.9 5.3 
ah ds wire WS WEE oo eo deh ok eee cas 6.6 9.9 6.2 
SP aie vise ee hg Pa dae eee es 8.2 11.3 6.9 
ARREST EES Vee 8.9 7.4 7.9 
eh 5 dk gate cs Vooe exe eee eer des 8.4 8.6 6.4 
ME. na dein icoe we ue Modes boat ey 8.9 8.1 tae 
Ee APO ee o. See 8.0 5.9 7.4 
RR ES 2 acm ah Spe 6.7 10.3 6.6 
EE ay Pere Cree; te A 6.7 5.7 5.7 
OS BRS Ce en Oe eee, * 4.4 6.4 4.9 
BG ino on ace. va 55S Sele De ac Dwake §.2 5.4 5.7 
| SE ee: See epee 4.1 3.2 5.8 
ih: Sl ae 65 ats 6d: 0k So See ane ted 4.0 2.0 5.1 
- EPIRA Sree Sy Pee eae & 3:3 2.0 3.9 
RE ae + Somer rnc, <r 2.7 2.2 3.1 
RR ah Spal rane Re 2.0 1.7 2.0 
| ESS Settled feces 1.7 1.2 2.6 
RS aera ashe Os 1.3 BS oe 
RRS eer ter SbF ae 1.1 Bg 1.5 
ERAN 4 Sree es Spy Seen 9 2 1.4 
SE fos Wiiare | <7 te 1.3 2 ) 
We MR b's als buenas cache « iS dia os wee 6 8 
ag OS eee © ree 3 5 1.0 
Far Ei = se P mee np er 3 3 
PEGE 5 nbh-e'e s nie cn Dusian tre wi Baars 1 7 
GA oak le Oates Oa ohn asad oes uae S| 
Median................ 36.9 39.4 36.2 
Mean.................. 34.7 37.9 34.5 
Sigma................. 10.8 8.92 10.9 





The distributions of spelling scores given in Table XV are shown 
in Figures IV, V, and VI, as compared to a normal distribution curve. 


2 Number of cases: Form A, 3,458; Form B, 406; Form C, 1,080. 
Figure IV. Graphic Representation of the Distribution of Spelling 
Scores, Indiana Composite Achievement Test, Form A, 
Compared to a Normal Distribution Curve (3,458 cases). [Horizontal axis: 
Scores; vertical axis: Per Cents of Total Cases.] 

Figure V. Graphic Representation of the Distribution of Spelling 
Scores, Indiana Composite Achievement Test, Form B, 
Compared to a Normal Distribution Curve (406 cases). [Horizontal axis: 
Scores; vertical axis: Per Cents of Total Cases.] 

Figure VI. Graphic Representation of the Distribution of Spelling 
Scores, Indiana Composite Achievement Test, Form C, 
Compared to a Normal Distribution Curve (1,080 cases) 


The skewness of all three forms is negative, or to the right. The 
degree of skewness is marked. The degree of skewness for Form A is 
—.61; for Form B, —.50; and for Form C, —.47. It is apparent from 
this measure that the spelling test does not meet all the requirements 
of a good test. 

The words used in these forms of this test were selected by two 
different methods of random sampling. The tests given at two different 
years give practically equivalent scores. If these words used in the 
various forms are a fair and reliable sampling of the words of the 
seventh and eighth grade spelling list, the distribution becomes signifi- 
cant as an efficiency measure of the success of the schools of the state 
in the teaching of spelling. 

The skewness of all the frequency distributions of the subject tests 
and the total composite score of Form C are given in Table XVI. 




TABLE XVI.—DEGREE OF SKEWNESS IN THE FREQUENCY DISTRIBUTION OF 
THE SUBJECT TESTS AND COMPOSITE TOTAL OF FORM C (1,080 CASES) 

                                      Measures 
Subject                  Mean     Median    Sigma    Skewness 
Arithmetic..........     50.5      52.0     16.9      −.27 
American History....     39.7      38.8     12.7      +.21 
Indiana History.....     27.7      27.8      6.8      −.04 
Civics..............     34.0      33.4      6.05     +.30 
Geography...........     61.6      62.3     14.8      +.14 
Language............     67.3      67.6     15.2      −.06 
Reading.............     55.3      55.1     16.0      +.04 
Physiology..........     37.4      37.9      [?]      −.28 
Spelling............     34.5      36.2     10.9      −.47 
Composite...........    404.0     408.0      [?]      −.14 








From the data given in Table XVI the skewness of the various 
subject tests in Form C tends to equalize that of the others. From the 
results of this measure at least four of the subjects—Indiana history, 
geography, language, and reading—approach closely a normal frequency 
distribution. 

Summary 


1. The frequency distributions of the total composite score of the 
three forms, A, B, and C, approach a normal distribution. 
2. The skewness of the distributions of the composite scores, Forms 
A, B, and C, as determined by the formula $Sk = 3(M - Md)/\sigma$, is relatively 
small. The distributions are all skewed slightly negatively. 

3. The spelling test in Forms A, B, and C is skewed to a marked 
degree. 

4. A study of the skewness of the other subjects in the composite 
tests of the three forms did not show any subject as markedly skewed 
as that of spelling. 








CHAPTER IV 


FURTHER STUDY OF THE VALIDITY, RELIABILITY, 
AND WEIGHTING OF THE INDIANA COM- 
POSITE ACHIEVEMENT TEST 


AFTER a large number¹ of the tests had been given, it seemed 
necessary to make a more comprehensive study of many phases of the 
validity and reliability of the tests as well as the method of weighting 
subject tests in the total composite score. 


Further Study of Validity 


In the original validation of the items, five methods were used: 
(a) analysis of course of study, (b) analysis of textbooks, (c) analysis 
of examination questions, (d) pooled judgments of teachers and adminis- 
trators, (e) per cent of “poor” and “good” pupils passing each item. 

A further study of the validity was made by (a) pooled judgments 
of a more representative group of competent teachers, (b) rise in per 
cent passing each item at successive grade levels, (c) correlation of the 
tests with teachers’ marks and correlation with other standardized tests, 
(d) per cent of “good” and “poor” pupils passing each item. 


Pooled Judgments of Competent Teachers 


To obtain further pooled judgments on each of the 2,456 items 
found in all the four forms, the county superintendents of schools of 
each county in the state were asked to suggest two seventh and eighth 
grade teachers or eighth grade teachers that they considered the best 
teachers in the county. A letter was then sent to each of these teachers 
asking her to coöperate. Since the task was one requiring much time, 
it was suggested that they not indicate that they would coöperate unless 
they could each give at least eight hours of working time to the project. 
Eighty-seven teachers indicated a willingness to coöperate. A copy of 
each of the four forms was mailed to each teacher with the following 
instructions. 

“Evaluate each item in each of the forms of the Indiana Composite 
Achievement Test as to its fitness for testing achievement of seventh 
and eighth grade pupils. Try to keep in mind what you as a teacher 
are attempting to achieve in your teaching. 


“Evaluate the items of this test by number according to the follow- 
ing plan: 


“If the item is entirely satisfactory (excellent), rate it...... 1 

“If the item is fairly satisfactory (good), rate it........... 2 

“If the item is objectionable (poor, should be eliminated), 
rate it...................................................... 3 


“Write the number representing your rating to the left of the item 
in the test booklet. 


¹ Approximately 24,000 copies of Form A, 10,000 of Form B, 25,000 of Form C, and 
6,000 of Form D of this test were used in 1927 and 1928. 


“Enclosed find stamped and addressed envelope for returning the 
test material after you have marked your ratings. If you will make the 
ratings and return the test at once it will be appreciated. Thank you 
for your coöperation.” 


TABLE XVII.—Mean Per Cent of All Items Marked “1,” “2,” or “3,” 
for Each Subject by Forms 

     |    Arithmetic     | American History  |  Indiana History 
Form |   1  |   2  |  3  |   1  |   2  |  3  |   1  |   2  |  3 
-----|------|------|-----|------|------|-----|------|------|----- 
A    | 92.0 |  5.6 | 2.4 | 80.9 |  8.1 | 2.0 | 84.4 | 11.5 | 3.8 
B    | 92.4 |  5.3 | 2.3 | 86.3 | 10.9 | 2.6 | 82.8 | 11.4 | 5.8 
C    | 92.1 |  6.3 | 1.5 | 87.2 |  9.8 | 2.3 | 84.3 | 11.9 | 3.7 
D    | 92.1 |  5.8 | 1.9 | 87.1 |  9.8 | 3.2 | 81.2 | 14.0 | 5.2 

     |      Civics       |     Geography     |     Language 
Form |   1  |   2  |  3  |   1  |   2  |  3  |   1  |   2  |  3 
-----|------|------|-----|------|------|-----|------|------|----- 
A    | 85.1 | 12.2 | 2.8 | 91.5 |  3.1 | 1.0 | 93.3 |  5.6 |  .9 
B    | 87.7 |  8.5 | 3.7 | 93.7 |  5.5 | 1.2 | 94.4 |  4.6 |  .9 
C    | 88.3 |  9.8 | 1.3 | 92.7 |  6.4 |  .9 | 96.5 |  2.5 | 1.0 
D    | 85.4 | 11.0 | 3.7 | 91.0 |  7.4 | 1.6 | 94.4 |  4.6 |  .5 

     |      Reading      |    Physiology     |     Spelling 
Form |   1  |   2  |  3  |   1  |   2  |  3  |   1  |   2  |  3 
-----|------|------|-----|------|------|-----|------|------|----- 
A    | 91.5 |  6.0 | 2.4 | 92.9 |  6.2 |  .9 | 92.2 |  6.3 | 1.0 
B    | 90.3 |  5.8 | 4.2 | 92.2 |  6.0 | 2.0 | 94.0 |  4.2 | 1.8 
C    | 98.8 |  5.5 |  .5 | 93.2 |  5.5 | 1.2 | 96.3 |  2.4 | 1.0 
D    | 92.4 |  5.7 | 1.1 | 91.0 |  6.9 | 2.3 | 95.0 |  3.2 | 1.6 


The average rating given all the items on all forms was as follows: 
a rating of “1,” 90.8 per cent; a rating of “2,” 7.2 per cent; and a 
rating of “3,” 1.9 per cent. These results may be interpreted as follows: 
of all the 2,456 items found in the tests, 90.8 per cent of them were 
considered “1,” or “excellent,” by these teachers; 7.2 per cent were con- 
sidered “2,” or “good”; and 1.9 per cent were considered “3,” or not 
worthy to be used in the test. 

From the averages given it seems that the items or questions on 
the various forms of the test were on the whole quite satisfactory. 








42 BULLETIN OF THE SCHOOL OF EDUCATION 


A study was then made of the specific items marked as “poor” 
by as many as 10 per cent of the teachers judging them. 


TABLE XVIII.—Number of Items and Per Cent of Total Number of 

     |      Arithmetic       |   American History    |    Indiana History 
Form | Number of  | Per Cent | Number of  | Per Cent | Number of  | Per Cent 
     | Items      |          | Items      |          | Items      | 
-----|------------|----------|------------|----------|------------|--------- 
A    | 1          | 3.3      | 1          | 1.3      | 9          | 18 
B    | 3          | 10.0     | 3          | 4.0      | 4          | 8 
C    | [illegible]| [illegible]| 2        | 2.7      | 3          | 6 
D    | 1          | 3.3      | 4          | 5.3      | 9          | 18 

     |        Civics         |       Language        |        Reading 
Form | Number of  | Per Cent | Number of  | Per Cent | Number of  | Per Cent 
     | Items      |          | Items      |          | Items      | 
-----|------------|----------|------------|----------|------------|--------- 
A    | [illegible]| [illegible]| 0        | 0        | 0          | 0 
B    | 2          | 4        | 0          | 0        | 1          | 1 
C    | 2          | 4        | 2          | 1.7      | 0          | 0 
D    | [illegible]| [illegible]| 0        | 0        | 1          | 1 

     |       Geography       |      Physiology       |        Spelling 
Form | Number of  | Per Cent | Number of  | Per Cent | Number of  | Per Cent 
     | Items      |          | Items      |          | Items      | 
-----|------------|----------|------------|----------|------------|--------- 
A    | 0          | 0        | 1          | 2        | 1          | 2 
B    | 1          | 1.1      | 1          | 2        | 1          | 2 
C    | 1          | 1.1      | 1          | 2        | 1          | 2 
D    | 0          | 0        | 3          | 6        | 0          | 0 














Of the 2,456 items found in the four forms of the test, the number 
of items marked by 90 per cent or more of the teachers as “1” or 
“excellent” was 1,628. Of the items, 168 were marked “1” or “excellent” 
by 100 per cent of all the teachers. Of the total items found in the four 
forms, 12 were considered by 20 per cent or more of the teachers not to 
be worthy of a place in the test.² These items should, no doubt, be 
eliminated. 

² One teacher marked more than one-half of all the items with a “3” and added 
the comment that she was “opposed to this type of testing.” Altho her judgment 
might be considered biased, it was nevertheless sincere and was included in the results. 
Correlation with Standard Tests and Teachers’ Marks 


The correlation of these tests with one other composite standardized 
test was carried out for two forms: A, in 1927, and C, in 1928. Since 
only one other composite test was available with which to compare these 
forms, this comparison was the only one made. A comparison was also 
made between a group intelligence test and Form A. 


TABLE XIX.—Correlation of the Indiana Composite Achievement Test 
Total Score and Certain Subject Scores with the Other Tests 
(Eighth Grade Pupils)³ 

Tests                                                                  | Coefficient of 
                                                                       | Correlation 
-----------------------------------------------------------------------|--------------- 
I. C. A. T., Total, Form A, vs. Stanford Achievement Test, Total       | .839 ± .019 
I. C. A. T., Total, Form C, vs. Stanford Achievement Test, Total       | .846 ± .022 
I. C. A. T., Arithmetic, Form A, vs. Stanford Achievement Test,        | 
  [illegible]                                                          | .849 ± .019 
I. C. A. T., Arithmetic, Form C, vs. Stanford Achievement Test,        | 
  [illegible]                                                          | .839 ± .022 
I. C. A. T., Reading, Form A, vs. Stanford Achievement Test,           | 
  [illegible]                                                          | .765 ± .029 
I. C. A. T., History, Form A, vs. Stanford Achievement Test,           | 
  History and Literature                                               | .709 ± .034 
I. C. A. T., Physiology, Form A, vs. Stanford Achievement Test,        | 
  Nature Study and Science                                             | .622 ± .039 
I. C. A. T., Language, Form A, vs. Stanford Achievement Test,          | 
  Language Usage                                                       | .427 ± .052 
I. C. A. T., Language, Form C, vs. Stanford Achievement Test,          | 
  Language Usage                                                       | .388 ± .071 
I. C. A. T., Spelling, Form A, vs. Stanford Achievement Test,          | 
  Dictation Exercise                                                   | .823 ± .020 
I. C. A. T., Spelling, Form C, vs. Stanford Achievement Test,          | 
  Dictation Exercise                                                   | .849 ± .021 
I. C. A. T., Total, Form A, vs. Otis Self-Administering Tests of       | 
  Mental Ability                                                       | .728 ± .031 
Stanford Achievement Test, Total, vs. Otis Self-Administering Tests    | 
  of Mental Ability                                                    | .819 ± .023 
I. C. A. T., Total, vs. [illegible]                                    | -.390 ± .058 
Stanford Achievement Test, Total, vs. Age of Pupils                    | [illegible] ± .056 
Otis Self-Administering Tests of Mental Ability, Total, vs.            | 
  [illegible]                                                          | [illegible] ± .059 








It would seem that the two tests (Indiana Composite Achievement 
Test and Stanford Achievement Test) are comparable and that, accept- 
ing the Stanford Achievement Test as a criterion, the forms of the 
Indiana Composite Achievement Test compared with it do not show as 
high a coefficient of correlation as should be expected. It is of course 
true that the Indiana Composite Achievement Test forms here compared 
contain subjects not included in the Stanford Achievement Test. It is 
likely, however, that this does not account for the fact that the 
correlation is no higher than .846 ± .022 for Form C. 


³ Total cases for Form A, 97; for Form C, 69. 
Four of the subject comparisons are worthy of consideration as a 
measure of validity of the test. The correlations between the scores 
made by the same pupils in arithmetic, reading, language, and spelling 
on the two tests are fairly comparable. The correlations of three 
(arithmetic, reading, and spelling) are approximately as high as those 
ordinarily found between standard tests in single subjects. The 
correlation in language is disappointingly low. 

The Stanford Achievement Test shows a higher correlation with 
the Otis Self-Administering Tests of Mental Ability than does the 
Indiana Composite Achievement Test form compared. 

Two forms, A and C, of the Indiana Composite Achievement Test, 
were also further validated by a comparison between the test scores 
made by pupils and the final marks given by teachers who had not 
seen the test scores. The marks were made out by the teacher shortly 
after the test was given. In one case this was also done using the 
scores of the Stanford Achievement Test, Form A, in the same manner. 
These results are given in Table XX. 


TABLE XX.—Correlations Between Test Scores and Teachers’ Marks 
(Eighth Grade Pupils)⁴ 

Test                                                                   | Coefficient of 
                                                                       | Correlation 
-----------------------------------------------------------------------|--------------- 
I. C. A. T., Total, Form A, vs. Average Marks                          | .731 ± .031 
Stanford Achievement Test, Total, Form A, vs. Average Marks            | .676 ± .037 
Otis Self-Administering Tests of Mental Ability, Intermediate,         | 
  vs. Average Marks                                                    | .518 ± .050 
I. C. A. T., Total, Form C, vs. Average Marks                          | [illegible] 
I. C. A. T., Arithmetic, Form A, vs. Arithmetic Marks                  | .724 ± .032 
I. C. A. T., Reading, vs. Reading Marks                                | .584 ± .041 
I. C. A. T., American History, vs. American History Marks              | .717 ± .033 
I. C. A. T., Language, vs. Language Marks                              | .731 ± .031 
Stanford Achievement Test, Arithmetic, vs. Arithmetic Marks            | [illegible] 
Stanford Achievement Test, Reading, vs. Reading Marks                  | .578 ± .045 
Stanford Achievement Test, History and Literature, vs. American        | 
  History Marks                                                        | .427 ± .054 
Stanford Achievement Test, Language, vs. Language Marks                | .484 ± .052 


The usual correlation found between teachers’ marks and test scores 
is not high. The correlations given here for the Indiana Composite 
Achievement Test, both for the total score and the single subjects, are 
approximately as high as those found for other tests. Henmon⁵ found 
the average correlation between history marks and history test scores 
to be .50. The correlation between teachers’ marks and scores on the 
Brown-Woody Civics Test is given as .65 by the authors.⁶ In a like 

⁴ Indiana Composite Achievement Test, Form A, 97 cases; Form C, 69 cases; Stanford 
Achievement Test, Form A, 97 cases; Otis Self-Administering Tests of Mental 
Ability (Intermediate), 97 cases. 

⁵ Henmon, V. A. C. “Some Limitations of Educational Tests,” Journal of Educational 
Research, 7:185-98, March, 1923. 
⁶ Manual, Brown-Woody Civics Test. World Book Company, 1926. 
manner a correlation of .664 ± .037 between teachers’ marks and test 
scores of the Sangren-Woody Reading Test is reported by the authors.⁷ 
Flemming⁸ found a correlation of .659 between Stanford Achievement 
Test scores and average of teachers’ marks in the junior high school. 


Increase in Per Cent Passing Items at Different Grade Levels 


An effective method of validating test items is to determine rise in 
per cent of successes from grade to grade. This is primarily a measure 
of the discriminative power of each item. Items that show a rise in 
per cent passing from grade to grade may be said to discriminate be- 
tween those grades. If the rise is rapid the discriminative power is 
great. If the item shows no rise from one grade to another, it shows 
no discrimination. A test made up of items all of this latter type 
would give grade norms for one grade the same as another. Items that 
show a decrease from one grade to the next higher grade tend to negate 
discrimination in successive grades. 

A study was made of the items of Forms B, C, and D with different 
grades. One hundred cases of Forms B and C were selected from 
distributions of a larger number of cases. The one hundred cases were 
representative of the larger distribution of cases.⁹ 


TABLE XXI.—Per Cent of Items by Subjects, That Show a Rise in Successes 
Thru Grades Six, Seven, and Eight 

Form | Subject          | Per Cent 
-----|------------------|--------- 
B    | Arithmetic       | 84 
C    | Arithmetic       | 97 
B    | American History | 89 
C    | American History | 85 
B    | Indiana History  | 62 
C    | Indiana History  | 98 
B    | Civics           | 76 
C    | Civics           | 88 
B    | Geography        | 75 
C    | Geography        | 63 
B    | Language         | 87 
C    | Language         | 85 
B    | Reading          | 79 
C    | Reading          | 92 
B    | Physiology       | 7[illegible] 
C    | Physiology       | 84 
B    | Spelling         | 74 
C    | Spelling         | 84 


⁷ Manual, Sangren-Woody Reading Test. World Book Company, 1926. 
⁸ Flemming, Cecile White. “A Detailed Analysis of Achievement in the High School,” 
Teachers College, Columbia University, Contributions to Education No. 196, 1926. 

⁹ The study of Form D was made with 79 cases. The data on that form are not 
included in this study. 
















The results shown in Table XXI indicate that many of the items 
in Form B, and a smaller number in Form C, did not meet the require- 
ment of rise in per cent passing in successive grades. The low per cent 
in geography can be accounted for by the fact that many pupils do not 
have geography in the eighth grade. This is true also of physiology. 
The low percentage in Indiana history for Form B is indicative of that 
test as a whole. It was far from satisfactory in many ways. Much 
effort was made to work out the test in Indiana history in Form C more 
carefully. In all subjects, except possibly geography and physiology, 
those items that do not show a rise in successive grades should be 
eliminated from the test. 

To give some idea of the discriminativeness of items, the following 
examples are taken from Form C. 


TABLE XXII.—Illustration of Rise or Fall in Successive Grades of Items 
in Language Test, Form C 

            |          Per Cent Passing 
Item No.    | Sixth Grade | Seventh Grade | Eighth Grade 
------------|-------------|---------------|------------- 
[illegible] | 96          | 97            | 98 
[illegible] | 89          | 93            | 97 
39          | 72          | 69            | 24 
47          | 30          | 54            | 67 
63          | 28          | 50            | 75 
108         | 4           | 12            | 64 











small degree. Item 39 tends to discriminate in a negative manner 
between grades in which this test is used. Items 47, 63, and 108 are 
good items from the point of view of discrimination. 
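
The reading of Table XXII can be sketched in code. This is an illustration only: the function name and the five-point threshold separating a "small" rise from a "good" one are assumptions that do not appear in the study, and the pairing of item numbers 39, 47, 63, and 108 with the last four rows of the table is inferred from the order in which the text discusses them.

```python
def classify_item(pct_by_grade: list[float], min_rise: float = 5.0) -> str:
    """Classify an item by the change in per cent passing from grade to grade.

    'negative' -> per cent passing falls somewhere between successive grades
    'good'     -> every grade-to-grade rise is at least min_rise points
    'weak'     -> never falls, but at least one rise is below min_rise
    """
    rises = [later - earlier for earlier, later in zip(pct_by_grade, pct_by_grade[1:])]
    if any(r < 0 for r in rises):
        return "negative"
    if all(r >= min_rise for r in rises):
        return "good"
    return "weak"


# Per cent passing in grades six, seven, and eight (Table XXII, Form C language test).
items = {
    "first row": [96, 97, 98],   # item number illegible in the source
    "second row": [89, 93, 97],  # item number illegible in the source
    "39": [72, 69, 24],
    "47": [30, 54, 67],
    "63": [28, 50, 75],
    "108": [4, 12, 64],
}

for item, pcts in items.items():
    print(item, classify_item(pcts))
```

With this threshold the first two rows fall in the weak class, item 39 in the negative class, and items 47, 63, and 108 in the good class, which matches the verbal description above. A different threshold would move the boundary between weak and good items.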

To show the average rise in successes of all the items in each 
subject test in Form C, the mean per cent passing all items is given 
in Table XXIII. 


TABLE XXIII.—Mean Per Cent Passing the Various Items of Each 
Subject Test in the Indiana Composite Achievement Test, Form C 
(100 cases for each grade)¹⁰ 

                 |     Grades 
Subject          |  6  |  7  |  8 
-----------------|-----|-----|---- 
Arithmetic       | 25  | 38  | 54 
American History | 27  | 43  | 54 
Indiana History  | 30  | 39  | 57 
Civics           | 45  | 54  | 66 
Geography        | 45  | 59  | 62 
Language         | 37  | 46  | 61 
Reading          | 28  | 42  | 58 
Physiology       | 32  | 50  | 63 
Spelling         | 32  | 50  | 63 














¹⁰ These cases were selected according to the distribution of scores of 440 pupils of 
the sixth grade, 320 of the seventh grade, and 1,080 of the eighth grade. 
Table XXIII should be read as follows: On the average, each item 
in the arithmetic test was passed by 25 per cent of the sixth grade, 
38 per cent of the seventh grade, and 54 per cent of the eighth grade 
pupils. This table shows that (a) there is on the average 
an increase in per cent passing in successive grades in all subjects, 
(b) the increase in per cent passing in geography between grades seven 
and eight is rather small, (c) physiology, 
which is comparable to geography in that it is often not taught in the 
eighth grade, shows a much greater average increase from grade seven 
to grade eight than does geography. 
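
The comparison in points (b) and (c) amounts to differencing adjacent columns of Table XXIII. A brief sketch follows, with a hypothetical function name. The physiology label is inferred from the usual order of subjects, since the printed row label is illegible.

```python
def grade_gains(pcts: tuple[int, ...]) -> tuple[int, ...]:
    """Gain in mean per cent passing between each pair of successive grades."""
    return tuple(later - earlier for earlier, later in zip(pcts, pcts[1:]))


# Mean per cent passing in grades six, seven, and eight (Table XXIII, Form C).
mean_pct_passing = {
    "Arithmetic": (25, 38, 54),
    "Geography": (45, 59, 62),
    "Physiology": (32, 50, 63),  # row label inferred from subject order
}

for subject, pcts in mean_pct_passing.items():
    six_to_seven, seven_to_eight = grade_gains(pcts)
    print(f"{subject}: +{six_to_seven} (6 to 7), +{seven_to_eight} (7 to 8)")
```

Geography gains 3 points between grades seven and eight, against 13 for physiology, which is the contrast the text describes.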

To validate items by the measure of increase in per cent passing 
in successive grades assumes continuous teaching of those subjects thruout 
the grades used as a measure. The curriculum in the state in which 
this work was carried out allows much freedom on the grade placement 
of some of the material. While American history does not have to be 
taught until the seventh grade, it is taught to a marked degree in 
grades five and six. While, in many schools, geography and physiology¹¹ 
are discontinued at the end of the seventh grade, others continue work 
in these subjects thru the eighth grade. The formal work in physiology 
is discontinued at least as early as, if not earlier than, the formal work in 
geography. Why was it that in physiology a much larger per cent of 
the items showed an increase in successive grades than was the case 
in geography? This was shown to be true in a comparison of items 
(see Table XXI) and in a comparison of the mean per cent passing 
the items of these two subjects in grades seven and eight (see Table 
XXIII). To the writer there seem to be two possible reasons: (a) the 
items used in physiology are more fundamental and lasting in the life 
of the child than those that are used in geography; (b) the instruction 
measured in physiology is actually carried on in grade eight thru the 
instruction in other subjects. The writer feels, from his knowledge of 
the instruction in civics taught primarily in grade eight and general 
science taught in some schools in the same grade, that much instruction 
in physiology is carried on in those schools that do not teach physiology 
in that grade. In geography this is not true to as great a degree. If 
geography is not taught as a subject, instruction in it stops. This 
comparison between geography and physiology suggests a phase of 
curricular study that might be made thru testing. The conclusions 
reached here concerning these two subjects are not necessarily true, 
but at least they suggest the very close relationship between the 
statistical and the curricular studies that are made of the validity of 
the items of a test. 


A Re-Study of Items Passed by “Good” and “Poor” Pupils 


A re-study of the validity of items of one of the four forms was 
made by finding the per cent of “poor” and “good” pupils passing each 
item. The original selection of the items was made by throwing out 
any item that did not show a larger per cent of the “good” than of the 
“poor” pupils passing it. To determine roughly the reliability of this 

¹¹ Some schools do not carry physiology into the seventh grade. 


procedure, the same kind of a study was made of the test items that 
remained and were used in the printed form. The study was made for 
grades six, seven, and eight. The results for the form studied are given 
in Table XXIV. 


TABLE XXIV.—Per Cent of Items by Subjects and Grades in Which 
“Poor” Pupils Did as Well as or Better Than “Good” Pupils 

                 |     Grades 
Subjects         |  6  |  7  |  8 
-----------------|-----|-----|---- 
Arithmetic       | 0   | 10  | 7 
American History | 4   | 4   | 4 
Indiana History  | 4   | 2   | 0 
Civics           | 0   | 2   | 2 
Geography        | 1   | 4   | 2 
Language         | 0   | 4   | 5 
Reading          | 5   | 6   | 3 
Physiology       | 2   | 8   | 0 
Spelling         | 0   | 0   | 2 


From a study of this form it would seem that, on the whole, the 
procedure was reliable. In arithmetic a sufficiently large per cent of 
items failed to show the expected result to cause some doubt as to the 
reliability of the procedure. The results for the seventh grade were 
not as satisfactory as those for the other grades. 


TABLE XXV.—Average Per Cent of “Poor” and “Good” Pupils, by 
Grades, Passing the Items in the Various Subjects of Form B of the 
Indiana Composite Achievement Test 

                 |                          Grades 
                 |        6        |        7        |        8 
Subject          | Poor   | Good   | Poor   | Good   | Poor   | Good 
-----------------|--------|--------|--------|--------|--------|------- 
Arithmetic       | 13     | 32     | 21     | 53     | 37     | 80 
American History | 10     | 38     | 18     | 49     | 26     | 83 
Indiana History  | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] 
Civics           | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] 
Geography        | 16     | 51     | 26     | 69     | 44     | 81 
Language         | 18     | 37     | [illegible] | [illegible] | 30 | 56 
Reading          | 19     | 52     | 22     | 58     | 33     | 7[illegible] 
Physiology       | 20     | 66     | 33     | 66     | 53     | 81 
Spelling         | 16     | 67     | 31     | 74     | 40     | 77 


Table XXV should be read as follows: in arithmetic in the sixth 
grade an average of 13 per cent of the “poor” pupils passed the items 
and an average of 32 per cent of the “good” pupils passed the items. 
The original items of Form B were selected on differences in per 
cent passing each item between “poor” and “good” pupils of the begin- 
ning ninth grade. However, the same distinction tends to hold good 
with other grades when the same test is made of the average per cent 
of the “poor” and “good” group passing the items. In an achievement 
test the distinction between “good” and “poor” groups cannot be made 
so great as in intelligence tests, since the other factors of compre- 
hensiveness, sampling, and course of study must enter into considera- 
tion. However, it seems fair to say that, when other necessary qualities 
of a test are present, the wider the distinction in per cent passing the 
items between “good” and “poor” pupils the better the test. 
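
The two checks described in this section, the share of items on which “poor” pupils did as well as or better than “good” pupils (the measure of Table XXIV) and the average gap between the two groups (the comparison of Table XXV), can be sketched as below. The function names and the per-item figures are hypothetical; the study does not print per-item results.

```python
def pct_nondiscriminating(items: list[tuple[float, float]]) -> float:
    """Per cent of items on which 'poor' pupils did as well as or better than 'good' pupils.

    Each item is a (poor_pct_passing, good_pct_passing) pair.
    """
    flagged = sum(1 for poor, good in items if poor >= good)
    return 100 * flagged / len(items)


def mean_gap(items: list[tuple[float, float]]) -> float:
    """Average difference (good minus poor) in per cent passing across items."""
    return sum(good - poor for poor, good in items) / len(items)


# Hypothetical per-item results for one subject in one grade.
items = [(13, 32), (40, 38), (21, 53), (37, 80), (50, 50)]

print(pct_nondiscriminating(items))  # share of items failing the criterion
print(mean_gap(items))               # a wider gap suggests better discrimination
```

Under the selection rule described above, the second and fifth hypothetical items would be thrown out, since the “good” group does not pass them more often than the “poor” group.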


A Study of the Accuracy of Measurement of the Indiana Composite 
Achievement Test 


To determine whether the reliability of the test as a whole and of 
each subject test was sufficiently high to give accurate results, a more 
detailed study was made of the reliability of each subject test of the 
various forms after they had been printed and used widely. The scores 
of each pupil on the “even” and on the “odd” items of each group of 
questions were correlated and this correlation was then corrected by 
the Spearman (sometimes called the Brown) prophecy formula.¹² All 
reliability coefficients were as high as, and some were higher than, that 
found in the preliminary work on reliability. The reliability of each 
test after it was given in the printed form is shown in Table XXVI. 
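
The evens-versus-odds procedure can be sketched as follows. This is a minimal illustration, assuming each pupil's responses are stored as a row of 0/1 item scores and that the correction is the standard Spearman-Brown formula with N = 2 (doubling the half-test length). The function names and the five-pupil data set are hypothetical.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half: float, n: float = 2.0) -> float:
    """Predicted reliability of a test n times as long as the one giving r_half."""
    return n * r_half / (1 + (n - 1) * r_half)


def split_half_reliability(item_scores: list[list[int]]) -> float:
    """Correlate odd-item and even-item half scores, then correct to full length."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Hypothetical responses: one row per pupil, one 0/1 entry per item.
pupils = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(pupils), 3))
```

The correction always raises a positive half-test correlation, because the full test is twice as long as either half.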


TABLE XXVI.—Subject Test and Total Score on Each Form of the 
Indiana Composite Achievement Test (Eighth Grade Only. Evens vs. 
Odds Method with a Correction)¹³ 

Form | Arithmetic  | American History | Indiana History | Civics | Geography 
-----|-------------|------------------|-----------------|--------|---------- 
A    | .834        | .845             | .703            | .876   | .826 
B    | .826        | .892             | .676            | .803   | .848 
C    | [illegible] | .924             | .809            | .823   | .888 
D    | .828        | .898             | .834            | .745   | .931 

Form | Language    | Reading     | Physiology  | Spelling    | Total 
-----|-------------|-------------|-------------|-------------|------ 
A    | [illegible] | [illegible] | [illegible] | [illegible] | .948 
B    | .874        | .914        | .716        | .832        | .956 
C    | .847        | .941        | .85[illegible] | [illegible] | .974 
D    | [illegible] | .944        | .908        | .876        | .986 





Two measures taken together determine how consistently a given 
measure will give the same result if used again under similar conditions. 

¹² The formula used was $r_N = \dfrac{N r}{1 + (N - 1)\, r}$. 
¹³ 150 cases for each of Forms A, B, and C; 112 cases for Form D. 
These two measures are the reliability coefficient and the standard 
deviation of the distribution. These two measures used in the formula, 
P. E. (pupil’s raw score) $= .6745\,\sigma\sqrt{1 - r_{12}}$, will give the amount of 
error to be expected in a pupil’s score. The probable error of a raw 
score was determined for each subject test¹⁴ and the total score on 
each of the forms. These were then compared to the standard deviation 
of the test and the difference in grade norms between grades seven and 
eight. 
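
The probable-error formula can be evaluated directly. The sketch below uses a hypothetical function name. The reliability of .974 is the Form C total from the reliability table, while the sigma of 81.2 is an assumed value chosen for illustration and is not printed in this chapter.

```python
from math import sqrt


def probable_error_of_score(sigma: float, reliability: float) -> float:
    """Probable error of a raw score: .6745 * sigma * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return 0.6745 * sigma * sqrt(1.0 - reliability)


# Form C total score: reliability from the study, sigma assumed for illustration.
pe = probable_error_of_score(sigma=81.2, reliability=0.974)
print(round(pe, 2))
```

The function shows the two ways a score becomes more dependable: a higher reliability coefficient shrinks the square-root term, and a smaller sigma shrinks the error in absolute units, though not relative to the spread of the group.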

These data are given in Tables XXVII, XXVIII, and XXIX. Since 
the number of cases upon which these measures for Form D were de- 
termined was small, the results concerning that form are not given in 
detail. 


¹⁴ The score of the test as weighted, if that subject were weighted in the total composite 
score, was taken as the raw score. 
To interpret the probable error of a score, it is necessary to know 
its relation to the sigma of the distribution and the difference between 
grade norms. Since the probable error may be interpreted to mean 
that the chances are even that the error of any given score will not 
exceed that amount, the relation of that amount to the variability of the 
scores of the grade is significant. Suppose that the probable error of 
any pupil’s score was as much as the difference in norms on the test 
from one grade to another. The chances would be even that, should a 
pupil make a score equal to the norm of one grade, he would make a 
score equal to the norm of the grade above or below if he were given 
the test again. Such a measure would be valueless for purposes of 
classification or promotion. As the relation between the probable error 
of a score and the standard deviation approaches zero, the test becomes 
a more accurate measure. The same is true in the relation of the 
probable error of a test score and the difference in grade norms. 

Since the total score is most used in the forms of this test, a dis- 
cussion of the probable error of the total score seems in place. The data 
on the total score only for each form is given in Table XXX. 
The meaning of the probable error measures shown in Table XXX 
may be interpreted in this manner. For Form A, if a given score is 
made by a pupil, e.g. 608, it can be said that the chances are 1 to 1 
that the true score of that pupil lies between 608—17 and 608+17, or 
between 591 and 625. For Form C, if a given score is made, e.g. 400, 
it can be said that the chances are 1 to 1 that the true score of the 
pupil lies between 400—8.85 and 400+8.85, or approximately between 
391 and 409. The probable error of the pupil’s total score in Form A 
is .155 of the sigma of that distribution and .141 of the difference be- 
tween the mean score of grade seven and grade eight. The probable 
error of a pupil’s total score in Form C is only .109 of the sigma of 
that distribution and .107 of the difference between the seventh and 
eighth grade norms of the test. The other forms may be interpreted in 
a like manner. 
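
The interpretation above can be reproduced with a few lines of arithmetic. The function names are hypothetical. The sigma recovered in the second step is only implied by the printed ratios, so it is approximate.

```python
def even_chance_band(score: float, probable_error: float) -> tuple[float, float]:
    """Range within which the true score lies with chances of 1 to 1."""
    return (score - probable_error, score + probable_error)


def implied_reference(probable_error: float, ratio: float) -> float:
    """Recover the sigma (or grade difference) from a P.E. and its printed ratio."""
    return probable_error / ratio


# Form A: score 608, P.E. 17. Form C: score 400, P.E. 8.85.
print(even_chance_band(608, 17))
low, high = even_chance_band(400, 8.85)
print(round(low), round(high))

# The P.E. of Form C is reported as .109 of the sigma of that distribution.
print(round(implied_reference(8.85, 0.109), 1))
```

The smaller the ratio of probable error to sigma or to the grade-norm difference, the narrower the band is in relation to the distances the test is meant to measure.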

A comparison is given in Table XXXI between the total composite 
score of the Indiana Composite Achievement Test, Form C, and the 
Stanford Achievement Test, Advanced Examination. The data for the 
Stanford Achievement Test are taken from the Manual of Directions 
of that test. The probable error given for that test is of the estimated 
true score. This measure is usually less than the probable error of the 
raw score of a test. 
It will be noted that the probable error of a pupil’s total score, 
interpreted in terms of the grade differences between grades seven and 
eight, shows that the Indiana Composite Achievement Test, Form C, 
gives a more accurate measure than the Stanford Achievement Test. 
Since the data were worked out on groups of pupils in different sections 
of the country and under different conditions, it is likely that the 
comparison is not very reliable. It is given here as an illustration of a 
method of comparing the accuracy of tests as measuring instruments. 


Further Study of Weighting Subjects in the Indiana Composite 
Achievement Test 


As explained in a previous chapter, a rather simple method of 
weighting the subjects in the composite test was used. This method 
took into account primarily the sigma of the distribution of the raw 
score (before the test was weighted) and the pedagogical weighting 
determined by pooled judgments. A further study of this question was 
made to determine whether a more careful method of weighting would 
give better results. To illustrate the two methods of weighting used, 
the data are given for Form C in Tables XXXII and XXXIII. 


TABLE XXXII.—Shorter Method of Determining Relative Weight to Be 
Assigned to Various Subjects of the Indiana Composite Achievement 
Test 

                 |                 Measures 
Subjects         |   σ    | P.W. | P.W./σ | Nominal Weight 
-----------------|--------|------|--------|--------------- 
Arithmetic       |  5.64  | 32.0 | 5.7    | 3 
American History | 12.68  | 25.5 | 2.0    | 1 
Indiana History  |  6.83  | 16.5 | 2.4    | 1 
Civics           |  6.05  | 21.0 | 3.5    | 2 
Geography        | 14.79  | 21.0 | 1.4    | 1 
Language         | 15.16  | 30.0 | 1.9    | 1 
Reading          | 15.98  | 37.0 | 2.4    | 1 
Physiology       |  5.50  | 15.0 | 2.7    | 1 
Spelling         | 10.87  | 21.6 | 2.0    | 1 






This method of determining weights takes into account the two 
factors of the sigma of the distribution and the pedagogical weight. 
This was the method used on all forms of the Indiana Composite 
Achievement Test. 
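
The third column of Table XXXII appears to be the pedagogical weight divided by the sigma of the raw score. That reading is an inference from the printed figures, since the column heading is illegible in the source. The sketch below checks it on four rows, using a hypothetical function name. The rule that turns these ratios into the nominal weights of 3, 2, and 1 is not stated in the text and is not reproduced here.

```python
def weight_ratio(pedagogical_weight: float, sigma: float) -> float:
    """Pedagogical weight per unit of raw-score sigma."""
    return pedagogical_weight / sigma


# (sigma, pedagogical weight) for four subjects of Form C, from Table XXXII.
# Subject labels other than American History are inferred from row order.
rows = {
    "Arithmetic": (5.64, 32.0),
    "American History": (12.68, 25.5),
    "Civics": (6.05, 21.0),
    "Geography": (14.79, 21.0),
}

for subject, (sigma, pw) in rows.items():
    print(f"{subject}: {weight_ratio(pw, sigma):.1f}")
```

A subject with a small sigma and a large pedagogical weight, such as arithmetic, needs the largest multiplier if its influence on the composite is to match the judgment of the teachers.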

A more careful method of weighting of subjects in a composite test 
takes into account: (a) the sigma of the raw score, (b) the coefficient 
of reliability, (c) the pedagogical weight assigned to each subject, and 
(d) the independence of measurement of each subject. One measure 
of the independence of a subject could be obtained by pooled judgments 
of teachers and administrators. In view of the fact that the pedagogical 
weight was obtained in this manner, it would have been well to have 
used the same type of measure of independence. However, this was not 
available. Another measure of the independence of one subject with 
respect to another is shown by the correlation of the scores of that sub- 
ject test with those of the other subject test. If the correlation is high, 
the independence will be small, since the two measures tend to measure 
the same thing. As the correlation becomes lower, the independence of 
measurement increases. The coefficient of correlation of each of the nine 
subjects with each other was obtained. The average coefficient of corre- 
lation of each subject with each of the other eight subjects was used as a 
measure of its independence. The reciprocal of each coefficient was ob- 
tained. This was multiplied by fifteen to give approximately equal 
numerical value to pedagogical weight and independence. The effect of 
this procedure is to give greater independence value to subjects that 
have lower coefficients of correlation with the other eight subjects and 
vice versa. Table XXXIII gives the detailed work of such a method of 
weighting. 
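
The independence measure described above can be sketched as follows. It is assumed here that the reciprocal is taken of each subject's average correlation with the other subjects and then multiplied by fifteen. The three-subject correlation table is hypothetical and stands in for the nine-subject table of the study.

```python
def independence_values(corr: dict[str, dict[str, float]], scale: float = 15.0) -> dict[str, float]:
    """Independence value per subject: scale / (mean correlation with the other subjects)."""
    values = {}
    for subject, row in corr.items():
        others = [r for other, r in row.items() if other != subject]
        mean_r = sum(others) / len(others)
        values[subject] = scale / mean_r
    return values


# Hypothetical intercorrelations among three subject tests.
corr = {
    "A": {"B": 0.6, "C": 0.4},
    "B": {"A": 0.6, "C": 0.5},
    "C": {"A": 0.4, "B": 0.5},
}

for subject, value in independence_values(corr).items():
    print(subject, round(value, 1))
```

The subject with the lowest average correlation receives the largest independence value, which is the effect the procedure is intended to have.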
An attempt was made to determine the influence of weighting on 
the placement of pupils. To determine this, the coefficient of correlation 
was calculated for three types of weighting. In one case no weighting 
was given any subject, in another case the weightings determined by 
the method shown in Table XXXII were used, and in still another case 
no subject except arithmetic was weighted and it was given a weight 
of three. The same one hundred papers were scored by the three 
methods and the coefficients of correlation determined. These corre- 
lations are given in Table XXXIV. 


TABLE XXXIV.—COEFFICIENTS OF CORRELATION BETWEEN THE TOTAL SCORE 
OF PUPILS WHEN OBTAINED BY DIFFERENT METHODS OF WEIGHTING 
(100 Cases) 

Methods Used                                             | r
No Weight vs. Weight of Arithmetic Only                  | .923 ± .010
No Weight vs. Weight by Detailed Method                  | .906 ± .013
Weight of Arithmetic Only vs. Weight by Detailed Method  | .934 ± .009

It would appear from the correlations shown in Table XXXIV that 
the more detailed method of weighting did not give widely different 
results from the simpler method of weighting used. 
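
The probable errors attached to the coefficients in Table XXXIV are not derived in the text. A minimal sketch, assuming the conventional formula P.E. = 0.6745(1 - r²)/√N, shows how such a figure would be computed; the function name `probable_error_r` is illustrative only. Under this assumption the first and third coefficients reproduce the printed values, while the second comes out at .012 rather than the printed .013.

```python
import math


def probable_error_r(r, n):
    """Probable error of a correlation coefficient r based on n cases.

    Assumes the conventional formula 0.6745 * (1 - r**2) / sqrt(n);
    the source does not state which formula was used.
    """
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


for r in (0.923, 0.906, 0.934):
    print(r, round(probable_error_r(r, 100), 3))
```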

Promotion from one grade to another is of much importance in 
the use of this test. The prognostic value of the various subject tests 
that make up the composite should have been considered in arriving at 
the weights to be assigned to the various subjects. This material was 
not available at the time of this study. 


Summary 


1. The forms of the Indiana Composite Achievement Test were 
further validated by (a) pooled judgments, (b) rise in per cent passing 
each item at successive grade levels, (c) correlation with other standard 
tests and with teachers’ marks, (d) the per cent of “good” and “poor” 
pupils passing each item. 

2. The pooled judgments of the “best” seventh and eighth grade 
teachers show a high percentage of excellent items and a very small 
percentage of “items that should be eliminated.” 

3. The coefficients of correlation of separate subject tests of the 
Indiana Composite Achievement Test, Forms A and C, with comparable 
tests in the Stanford Achievement Test, Advanced Examination, were 
not in all cases as high as they would be expected to be. 

4. The coefficient of correlation of total scores of the Indiana 
Composite Achievement Test and the Stanford Achievement Test, Inter- 
mediate Examination, was not as high as two composite tests should 
give. 

5. The coefficient of correlation between teachers’ marks and total 
score on the Indiana Composite Achievement Test was higher than that 
found between teachers’ marks and the Stanford Achievement Test. 

6. The per cent of items that showed an increase in per cent passing 
in successive grades was larger in Form C than in Form B. The 
per cent of items that meet this standard in Form B is 76, in Form C 
it is 86. 

7. The average per cent passing all the items shows an increase 
from grade to grade. The difference is small between seventh and 
eighth grade in geography. 

8. A re-study of the per cent of “poor” and “good” pupils passing 
each item showed that the original validation by this method was reli- 
able. 

9. A measure of the accuracy of measurement of the various sub- 
ject tests and the total composite test, by the measurement of the 
probable error of the score, proved that the tests are highly accurate. 
This was proved by comparison with the standard deviation and the 
difference between grade norms. 

10. The probable error of a score compared to the difference be- 
tween seventh and eighth grade norms is less with the Indiana Com- 
posite Achievement Test, Form C, than with the Stanford Achievement 
Test, Form A, Advanced Examination. 

11. A more detailed method of weighting of the separate subjects 
in the composite test, which took into account (a) the standard deviation 
of the distribution, (b) the pedagogical weighting, (c) the coefficient of 
reliability, and (d) the independence of measurement, gave different 
weighting to the various subject tests than the shorter method that took 
into account only (a) the standard deviation, and (b) the pedagogical 
weighting. 

12. The correlation between the total composite score of the same 
test papers with different weightings was marked. 

















CHAPTER V 


THE INDIANA COMPOSITE ACHIEVEMENT TEST AS A 
MEASURE OF ACHIEVEMENT IN FOUR 
TYPES OF RURAL SCHOOLS 


IN the rural schools of Indiana there are four types of organiza- 
tion: (a) the one-room school with one teacher for eight grades; (b) 
the two- and three-room schools with two or three teachers for the eight 
grades; (c) the consolidated school, usually with one teacher to a grade, 
in which the first eight grades are considered as the elementary school 
(in this study this is called the eight-four plan); and (d) the six-six 
plan of organization, in which the seventh and eighth grades are a part 
of the high school organization and are taught by high school teachers. 

Since a large number of the various types of schools used the 
Indiana Composite Achievement Test, Form A, in 1927, it was thought 
advisable to study the achievement of the eighth grade pupils in these 
types of schools. 

The results from 14 counties, well distributed over the state, in 
which were found each of the four types of schools, were compared. The 
data on the number of schools of each type and the number of pupils 
whose scores were used are given in Table XXXV. 


TABLE XXXV.—NUMBER OF SCHOOLS, NUMBER OF MANUSCRIPTS, AND 
AVERAGE NUMBER OF MANUSCRIPTS PER SCHOOL FOR THE FOUR TYPES OF 
SCHOOLS OF THIS STUDY 

Type of School               | Number of Schools | Number of Manuscripts | Average Number Manuscripts per School
Eight-four plan schools      | 102               | 1229                  | 12.05
Two- and three-room schools  | 75                | 413                   | 5.5
One-room schools             | 276               | 793                   | 2.87
Six-six plan schools         | 29                | 661                   | 22.8
Total                        | 482               | 3096                  |

It will be noted from Table XXXV that the number of schools of 
the six-six type is much smaller than the other types. The number of 
pupils represented in the eight-four types is much greater than that 
found in any other type. The number of pupils per school is much 
larger in the six-six type of school. 

A comparison was first made of the total score on the Indiana 
Composite Achievement Test by types of schools. These results are 
given in Table XXXVI. 


TABLE XXXVI.—CONSTANTS OF THE DISTRIBUTIONS OF TOTAL SCORES MADE 
BY PUPILS FROM EACH OF FOUR TYPES OF SCHOOLS IN INDIANA. INDIANA 
COMPOSITE ACHIEVEMENT TEST, FORM A, APRIL, 1927 

Types of Schools                     | Q1  | Q3  | Q     | P.E.  | Md. | M      | S.D.
Graded eight-four plan schools       | 559 | 714 | 77.45 | 69.9  | 638 | 635    | 103.
Two- and three-room schools          | 554 | 700 | 77.85 | 76.79 | 631 | 620    | 113.
Six-six plan schools                 | 507 | 660 | 76.5  | 74.7  | 582 | 583    | 106.
One-room schools                     | 503 | 656 | 76.6  | 74.5  | 590 | 579    | 110.
State-wide, 5,000 pupils, all types  | 525 | 715 | 85.0  | 79.0  | 615 | ...... | 104

It will be noted from these results that the achievement of eighth 
grade pupils on the scores of the Indiana Composite Achievement Test, 
Form A, differed in the different type schools. This is further shown 
by Figure VII. 

Figure VII. Range of the Middle 50 Per Cent of Scores of Four Types 
of Schools on the Indiana Composite Achievement Test 


It becomes apparent from these data that the one-room schools and 
the six-six consolidated schools did not achieve, as measured by this 
test, as well as the other two types of schools studied. The differences 
are significant. The mean of the eight-four schools is 51.7 points higher 
than the six-six schools and 55.9 points higher than the one-room 
schools. In the latter case this difference is 11.4 times the probable 
error of difference (55.9/4.9). 

It might be expected that the consolidated schools of the eight-four 
type and the two- and three-room schools would achieve better than the 
one-room schools. However, it was surprising to find that the six-six 
plan of consolidated school achieved approximately as poorly as the 
one-room schools. It was then necessary to carry further the comparative 
study of these schools upon several bases: (a) a comparison 
of those subjects in which the types might be fairly compared because 
of the subjects in which they received instruction, (b) a comparison of 
the textbooks used in these comparable subjects, (c) a comparison of 
the ages of the various groups. It was deemed further advisable to 
study the attitude of teachers in the six-six schools toward their work 
with seventh and eighth grade pupils. This study seemed necessary 
because of the fact that they were taught by high school teachers. 

Due to a difference in organization, the six-six type of school is not 
entirely comparable to the other schools studied. The differences here 
noted are (a) the six-six schools do not have as many subjects in the 
seventh and eighth grades, (b) they give more recitation time to each 
subject, and (c) their teachers are trained for high school work. A 
comparison was made of the achievement of the various type schools in 
those subjects taught in the six-six type schools. Those subjects were 
arithmetic, American history, civics, and language. 


TABLE XXXVII.—CONSTANTS OF THE DISTRIBUTIONS OF THE SCORES MADE 
BY PUPILS FROM EACH OF FOUR TYPES OF SCHOOLS IN INDIANA. ARITHMETIC 
TEST, INDIANA COMPOSITE ACHIEVEMENT TEST, FORM A, APRIL, 1927 

Types of Schools                | Q1    | Q3    | Q     | P.E.  | Md.  | M     | S.D.
Graded eight-four plan schools  | 49.5  | 70.4  | 10.45 | 7.70  | 61.7 | 59.4  | 11.54
Two- and three-room schools     | 46.95 | 70.25 | 11.67 | 11.34 | 59.6 | 57.35 | 17.0
One-room schools                | 43.7  | 65.0  | 10.65 | 10.86 | 55.1 | 54.26 | 16.3
Six-six plan schools            | 45.2  | 66.4  | 10.6  | 7.7   | 55.1 | 56.1  | 11.54


TABLE XXXVIII.—CONSTANTS OF THE DISTRIBUTION OF THE SCORES MADE BY 
PUPILS FROM EACH OF FOUR TYPES OF SCHOOLS IN INDIANA. AMERICAN 
HISTORY, INDIANA COMPOSITE ACHIEVEMENT TEST, FORM A, APRIL, 1927 

Types of Schools             | Q1    | Q3   | Q     | P.E.        | Md.   | M     | S.D.
Eight-four plan schools      | 31.1  | 50.5 | 9.7   | 8.83        | 39.9  | 40.8  | 13.1
Two- and three-room schools  | 30.25 | 51.3 | 10.53 | 9.04        | 41.0  | 40.76 | 13.4
One-room schools             | 25.5  | 44.3 | 9.4   | 8.63        | 33.6  | 35.5  | 12.8
Six-six plan schools         | 27.5  | 45.6 | 9.05  | 6.43        | 35.75 | 37.4  | 9.54
All types                    | 27.8  | 48.4 | 10.3  | [illegible] |       |       |


TABLE XXXIX.—CONSTANTS OF THE DISTRIBUTION OF THE SCORES MADE BY 
PUPILS FROM EACH OF FOUR TYPES OF SCHOOLS IN INDIANA. CIVICS, INDIANA 
COMPOSITE ACHIEVEMENT TEST, FORM A, APRIL, 1927 

Types of Schools             | Q1   | Q3    | Q           | P.E.        | Md.   | M     | S.D.
Eight-four plan schools      | 49.1 | 64.9  | [illegible] | [illegible] | 57.6  | 56.6  | 11.45
Two- and three-room schools  | 48.2 | 64.7  | 8.27        | 7.93        | 57.66 | 56.34 | 11.76
One-room schools             | 43.  | 60.25 | 8.45        | 7.32        | 52.4  | 51.8  | 10.85
Six-six plan schools         | 45.6 | 62.5  | 8.45        | 8.2         | 53.5  | 53.6  | 12.12


TABLE XL.—CONSTANTS OF DISTRIBUTION FOR EACH TYPE OF SCHOOL 
STUDIED, BASED UPON THE LANGUAGE TEST OF THE INDIANA COMPOSITE 
ACHIEVEMENT TEST, FORM A, AS ADMINISTERED TO RURAL SCHOOL PUPILS 
OF INDIANA IN APRIL, 1927 

Types of Schools                 | Q1    | Q3    | Q     | P.E.  | Md.    | M     | S.D.
Eight-four plan schools          | 120.6 | 158.9 | 19.15 | 19.29 | 138.25 | 140.1 | 28.6
Two- and three-room schools      | 114.5 | 155.3 | 20.4  | 20.43 | 133.3  | 135.9 | 30.3
One-room schools                 | 106.7 | 144.4 | 18.85 | 18.62 | 126.2  | 126.9 | 27.6
Six-six plan schools             | 105.9 | 149.8 | 21.95 | 21.25 | 127.7  | 128.6 | 31.5
All types in state-wide survey   | [figures illegible]


It seems apparent from the results shown in Tables XXXVI to XL 
that the achievement of the six-six type and one-room type of school, 
as measured by this test, was consistently and significantly lower than 
that of the other two types of schools studied. 
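
The columns headed Q and P.E. in these tables are not defined in the text. A minimal sketch, assuming that Q is the semi-interquartile range, (Q3 - Q1)/2, and that P.E. is 0.6745 times the standard deviation, recomputes both from the Language test figures of Table XL. The function names are illustrative only, and both definitions are assumptions, although they agree with the printed Language figures to within a hundredth.

```python
# Q1, Q3, printed P.E., and S.D. as given in Table XL (Language test).
language_table = {
    "Eight-four plan schools": (120.6, 158.9, 19.29, 28.6),
    "Two- and three-room schools": (114.5, 155.3, 20.43, 30.3),
    "One-room schools": (106.7, 144.4, 18.62, 27.6),
    "Six-six plan schools": (105.9, 149.8, 21.25, 31.5),
}


def semi_interquartile_range(q1, q3):
    """Half the distance between the first and third quartiles (assumed Q)."""
    return (q3 - q1) / 2


def probable_error(sd):
    """Probable error of a distribution, assumed to be 0.6745 * S.D."""
    return 0.6745 * sd


computed = {
    school: (
        round(semi_interquartile_range(q1, q3), 2),
        round(probable_error(sd), 2),
    )
    for school, (q1, q3, _printed_pe, sd) in language_table.items()
}
for school, (q, pe) in computed.items():
    print(f"{school}: Q = {q}, P.E. = {pe}")
```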

A study was then made of the textbooks used in the six-six type 
of school. At the time this study was made, Indiana had a uniform 
adoption of texts for all schools. Due to the fact that the six-six type 
of school desired some liberty in the choice of texts, it seemed that a 
study should be made to determine whether such practice was then 
taking place. This was necessary in view of the fact that the Indiana 
Composite Achievement Test was validated upon the basis of the state 
course of study and state adopted textbooks. 

The information concerning textbooks was obtained from the 
principal of each of the various six-six schools studied. This informa- 
tion is given in Table XLI. 


TABLE XLI.—TEXTS USED IN SEVENTH AND EIGHTH GRADES OF SIX-SIX 
TYPE SCHOOLS 

Subjects          | Number of Schools Studied | Number Using Texts Different from State Adopted Texts
Arithmetic        | 29                        | 5
American History  | 29                        | 0
Language          | 29                        | [illegible]
Civics            | 29¹                       | 3


It is difficult to say how much effect this variation in texts might 
have had on the variations in scores of the six-six type of schools. It 
would seem likely that the difference was very slight in civics and not 
present in American history. It is doubtful whether the variation in 
the use of texts in the other two subjects would account for the differ- 
ence of the schools as a whole. 


¹ There was no state adopted text in civics. The test was validated on four texts. 

The other types of schools may have used texts that were not state 
adopted. However, in view of the sentiment toward changes in texts 
for the six-six type schools, it is probable that the use of texts other 
than those state adopted was more common with this type of school. 

The age at which pupils finish the eighth grade is of importance in 
studying the comparative achievement of the pupils of the various type 
schools. Should the age of entrance in school be equal in these various 
type schools, then the age of completion of the eighth grade would be 
somewhat significant as a comparison. The conditions surrounding the 
eight-four and the six-six type of school are so nearly equal that it is 
fair to assume that the age of entrance would be equal. The age of 
the pupil as of April, 1927, was given on each manuscript. Complete 
data on age were available for only 2,377 pupils. These data are, how- 
ever, representative of the 3,096 pupils of the entire study. The results 
are given in Table XLII. 


TABLE XLII.—AVERAGE AGE IN MONTHS OF PUPILS IN THE VARIOUS TYPES 
OF SCHOOLS (Age as of April, 1927) 

Type of School                  | Number of Pupils (8th Grade) | Average Age in Months | Standard Deviation in Months | Probable Error in Months
Graded eight-four plan schools  | 921                          | 171.90                | 12.70                        | 8.
Two- and three-room schools     | 293                          | 173.10                | 11.17                        | 7.53
One-room schools                | 717                          | 175.83                | 12.70                        | 8.56
Six-six plan schools            | 446                          | 172.60                | 11.52                        | 7.58
All types                       | 2,377                        | 173.4                 |                              |

Table XLII shows that the average age of pupils of the eight-four 
type of school was 3.93 months less than that of the one-room 
schools and .7 of a month less than that of the six-six type schools. 
It will be noted that the difference between the average age of the 
one-room schools and the eight-four consolidated schools was approxi- 
mately as much as one-half of the rural school year of eight months. 
This difference would be significant if the age of entrance and attend- 
ance conditions surrounding the two types of schools were the same. 
It is apparent, however, that the difference in achievement between the 
six-six schools and the eight-four schools cannot be accounted for in 
difference in age of eighth grade pupils. 
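
The age comparisons can be checked directly from Table XLII. The sketch below uses the pupil counts and average ages printed in that table; the variable names are illustrative, and the weighting of each average by its number of pupils is an assumption about how the all-types figure was obtained.

```python
# (number of pupils, average age in months) from Table XLII.
ages = {
    "eight-four": (921, 171.90),
    "two- and three-room": (293, 173.10),
    "one-room": (717, 175.83),
    "six-six": (446, 172.60),
}

base_age = ages["eight-four"][1]

# How many months older each other type is than the eight-four type.
differences = {
    name: round(mean - base_age, 2)
    for name, (_n, mean) in ages.items()
    if name != "eight-four"
}

# Average age over all pupils, weighting each type by its pupil count.
total_pupils = sum(n for n, _mean in ages.values())
overall_age = sum(n * mean for n, mean in ages.values()) / total_pupils

print(differences)
print(total_pupils, round(overall_age, 1))
```

The one-room difference of 3.93 months is the figure compared in the text with one-half of the eight-month rural school year.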

At the time of this study the teachers who taught in the six-six 
type of school in Indiana were required to have three or more years of 
training above high school. They were primarily [trained] to teach high 
school. The supervised teaching and special [illegible] courses received 
in training were primarily preparation for high school teaching. A 
questionnaire study was made of all teachers in the six-six schools in 
the state to determine, among other things, their attitude toward teaching 
of seventh and eighth grade pupils. At the time the study was made, 
77 per cent of the teachers in the six-six schools had four or more years 
of training beyond the high school. 

There were 338 teachers who taught in both the seventh and eighth 
grades and the upper four years of high school work. The teachers 
were asked to indicate in an unsigned questionnaire in which grades 
they preferred to teach,—i.e., in grades seven and eight, or in the upper 
four grades. The results of their answers are given in Table XLIII. 


TABLE XLIII.—ATTITUDE OF TEACHERS ON PREFERENCE OF GRADES IN SIX- 
SIX SCHOOLS (Teachers Teach in Both Seventh and Eighth Grades and in 
Four Upper Grades.) 

Preference                                       | Number of Teachers | Per Cent of Teachers
Prefer only upper four grades                    | 222                | 65.69
Prefer all six grades or grades seven and eight  | 98                 | 28.99
Dependent on subject                             | 4                  | 1.18
Do not know                                      | 14                 | 4.14
Total                                            | 338                | 100.00


Approximately 66 per cent of the teachers who teach in the seventh 
and eighth grades prefer to teach only in the upper four grades of the 
six-six school. There is no difference made in the pay or license require- 
ments for the two groups of grades. Further investigation of the 
reasons for this attitude indicated that training, lack of interest in 
subject-matter of seventh and eighth grades, desirable traits of pupils 
of upper four years, and undesirable traits of pupils of seventh and 
eighth grades were the outstanding reasons for this preference. It 
seems apparent from this study that a large per cent of the teachers of 
seventh and eighth grade pupils were out of sympathy with the work 
of the grades. 

The junior high school, of which the six-six type school is a variant, 
was not organized around the concept of increased achievement of the 
type measured in the Indiana Composite Achievement Test. However, 
it is equally true that the aims of the junior high school must be 
attained if achievement as measured by the Indiana Composite Achieve- 
ment Test is not to be maintained. There is little proof of the attain- 
ment of the six-six schools in Indiana in guidance, exploratory courses, 
adjustment to individual differences, etc. 

There seems to be, in this comparison between the various types 
of schools, no reason that would explain the difference in achievement 
between the six-six type of school and the eight-four type except the 
teachers and possibly the objectives of the two types of schools. 


Summary 


1. The comparison of achievement of eighth grade pupils as 
measured by the Indiana Composite Achievement Test, Form A, of four 
types of schools, (a) one-room, (b) two- and three-room, (c) eight-four 
consolidated, and (d) six-six consolidated, showed that the eight-four 
consolidated schools ranked higher than any other type. 

2. The achievement of eighth grade pupils of the one-room and 
the six-six consolidated schools was significantly lower than the achieve- 
ment of pupils of the eight-four consolidated schools. 

3. The differences in achievement between six-six and eight-four 
schools cannot be accounted for by the age of pupils or the textbooks 
used. 

4. A possible cause is found in the type of training and attitude 
of the teachers of the seventh and eighth grades in six-six schools. A 
large per cent of these teachers do not like to teach seventh and eighth 
grade pupils. 








CHAPTER VI 


THE PROGNOSTIC VALUE OF THE INDIANA COM- 
POSITE ACHIEVEMENT TEST 


ALTHO the Indiana Composite Achievement Test was not developed 
primarily as a prognostic test, nevertheless its value as an instrument 
of promotion is to a rather large degree measured by its prognostic 
ability. 

An attempt was made (a) to determine to what extent scores made 
on the Indiana Composite Achievement Test, Form A, could be used to 
predict general success in high school work; (b) to determine whether 
teacher rating of certain character traits would be a valuable aid to 
this prognosis; and (c) to compare the Indiana Composite Achievement 
Test, Form A, with other measures that might be used as prognostic 
measures. 

To determine the prognostic value of the total composite score on 
the Indiana Composite Achievement Test, Form A, the scores of 339 
eighth grade pupils in 1926-27 who were in the ninth grade in con- 
solidated high schools the following year (1927-28) were obtained. The 
measures of high school success used were (a) marks in high school 
subjects, and (b) scores in standard tests in high school subjects. 

The average mark of each pupil in three subjects was used as a 
criterion. The marks in English, algebra, and Latin or biology were used 
as a basis for the average mark. This meant that each pupil’s average 
mark for the year was made up of two semesters’ marks in each of the 
subjects of English and algebra and two semesters’ marks either in 
Latin or biology. The interest in this study was primarily in the 
question of how well the scores on the Indiana Composite Achievement 
Test would foretell ability to do high school work in general and not 
in any special subject in high school. 

The high school tests used were the Tressler Minimum Essentials 
in English Tests, White Latin Test, Douglas Diagnostic Tests for First 
Year Algebra, and Ruch-Cossman Biology Test. 

Twenty-two schools in 10 counties, well scattered over the state, 
were used in this part of the study. The standard tests were given 
by the high school principals and the high school marks were copied 
from the pupils’ final record. All marks were reduced to the per cent 
basis. 

The coefficients of correlation between the Indiana Composite 
Achievement Test scores and success in high school work as measured 
by the average mark were obtained for each school having 20 or more 
ninth year pupils, and for the whole group of pupils taken together. 
The correlations for individual schools varied from .414 ± .098 to 
.888 ± .036. The mean of these coefficients of correlation was .735. These 
results are given in Table XLIV. 


TABLE XLIV.—COEFFICIENTS OF CORRELATION BETWEEN INDIANA COMPOSITE 
ACHIEVEMENT TEST, FORM A, SCORES AND AVERAGE HIGH SCHOOL MARKS 
IN TEN SCHOOLS WITH TWENTY OR MORE PUPILS IN THE NINTH GRADE 

School       | r    | P.E. | Number of Pupils
[illegible]  | .830 | .040 | 24
[illegible]  | .683 | .075 | 30
[illegible]  | .806 | .044 | 28
[illegible]  | .888 | .036 | 20
[illegible]  | .755 | .054 | 30
[illegible]  | .414 | .098 | 30
[illegible]  | .619 | .096 | 20
[illegible]  | .870 | .034 | 31
[illegible]  | .717 | .063 | 27
[illegible]  | .801 | .044 | 33

The coefficient of correlation for all of the 339 pupils was not so 
high as the mean correlation of the 10 schools given above. This is 
given in Table XLV. 


TABLE XLV.—COEFFICIENTS OF CORRELATION BETWEEN THE SCORES OF THE 
INDIANA COMPOSITE ACHIEVEMENT TEST, FORM A, AND SUCCESS IN HIGH 
SCHOOL MARKS AS MEASURED BY THE AVERAGE MARK 

School               | r    | P.E. | Number of Cases
All pupils           | .652 | .021 | 339
Mean of ten schools  | .735 | .058 | 27

The other measure of success in high school work which was used 
was the scores made on standard tests in high school subjects. Since 
English and algebra tests were taken by all pupils, the results of the 
scores made on these tests were correlated with the scores made on the 
Indiana Composite Achievement Test, Form A. These are given in 
Table XLVI. 


TABLE XLVI.—COEFFICIENTS OF CORRELATION BETWEEN SCORES OF INDIANA 
COMPOSITE ACHIEVEMENT TEST WITH SCORES ON STANDARD TESTS IN HIGH 
SCHOOL SUBJECTS (10 Schools) 

School       | Subjects | r           | P.E.        | Number of Pupils
[illegible]  | English  | .661        | .078        | 24
             | Algebra  | .678        | .074        | 24
[illegible]  | English  | .379        | .107        | 30
             | Algebra  | .485        | .096        | 30
[illegible]  | English  | .589        | .083        | 28
             | Algebra  | .719        | .060        | 28
[illegible]  | English  | .874        | .036        | 20
             | Algebra  | .664        | .086        | 20
[illegible]  | English  | .50[?]      | .092        | 30
             | Algebra  | .547        | .092        | 30
[illegible]  | English  | .622        | .078        | 30
             | Algebra  | .115        | .122        | 30
[illegible]  | English  | [illegible] | [illegible] | 20
             | Algebra  | .658        | .076        | 20
[illegible]  | English  | .710        | .062        | 31
             | Algebra  | .617        | .078        | 31
[illegible]  | English  | .877        | .030        | 27
             | Algebra  | .487        | .093        | 27


It will be noted in Table XLVI that the coefficients of correlation 
found in the different schools vary widely; in English, from .379 ± .107 
to .877 ± .030, and in algebra from .115 ± .122 to .719 ± .060. The 
coefficients of correlation for English are much higher on the average than 
those in algebra. 

To determine the relationships between these measures for the 
group of pupils as a whole, the coefficients of correlation were de- 
termined for all the cases tested, both by the single subject scores and 
by the scores made by each pupil in all three¹ tests taken together as 
a composite score. These results are given in Table XLVII. 
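
A footnote to Table XLVII states that the composite score was obtained by dividing each pupil's score in each test by the standard deviation and adding the resulting scores. A minimal sketch of that step follows. The scores are hypothetical, the function name `composite_scores` is an assumption, and the use of the population standard deviation is likewise assumed, since the source does not say which form was used.

```python
from statistics import pstdev


def composite_scores(tests):
    """Sum each pupil's scores after dividing every test by its S.D.

    tests: dict mapping a test name to a list of scores, one per pupil,
           with pupils in the same order in every list.
    """
    sds = {name: pstdev(scores) for name, scores in tests.items()}
    n_pupils = len(next(iter(tests.values())))
    return [
        sum(tests[name][i] / sds[name] for name in tests)
        for i in range(n_pupils)
    ]


# Hypothetical scores for three pupils, for illustration only.
example_tests = {
    "english": [10, 20, 30],
    "algebra": [1, 2, 3],
}
example_composites = composite_scores(example_tests)
print([round(c, 2) for c in example_composites])
```

Dividing by the standard deviation puts the tests on a common scale, so that a test with a wide numerical range does not dominate the sum.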


¹ The pupil was given tests in English, algebra, and biology or Latin. The latter 
two were considered as one subject since pupils took either Latin or biology. 


TABLE XLVII.—COEFFICIENTS OF CORRELATION BETWEEN THE SCORES OF 
INDIANA COMPOSITE ACHIEVEMENT TEST, FORM A, AND SUCCESS IN HIGH 
SCHOOL WORK AS MEASURED BY STANDARD TEST SCORES IN VARIOUS HIGH 
SCHOOL SUBJECTS AND BY A COMPOSITE OF HIGH SCHOOL TEST SCORES 

Subjects                    | r    | P.E. | Number of Cases
English                     | .671 | .019 | 339
Algebra                     | .532 | .029 | 339
Biology                     | .665 | .023 | 254
Latin                       | .294 | .035 | 85
Composite² of all subjects  | .551 | .026 | 339

It seems apparent from the coefficients of correlation of the zero 
order of the Indiana Composite Achievement Test, Form A, with (a) 
average of marks in the first year of high school, and (b) the composite 
standard test scores in high school subjects, that the value of the total 
composite score of the Indiana Composite Achievement Test is somewhat 
limited as a prognostic measure of success in high school work. 

However, the coefficients of correlation between total composite score 
and average ninth grade work in two high schools were high enough 
to make the Indiana Composite Achievement Test of real value in that 
function. The prognostic value of the total composite score of the Indi- 
ana Composite Achievement Test, when applied to the two high school 
subjects of English and biology, when the achievement in those two 
subjects was measured by the use of standard tests, shows a significant 
correlation. In the other two high school subjects of Latin and algebra 
the coefficient of correlation was not high enough to be of much 
value in prognosis. 

An attempt to determine whether certain traits and extra-curricular 
activity might bear a relation to the pupil’s achievement was made. To 
do this, each of the 339 pupils was rated by his high school principal 
and teachers on five traits.³ 

These ratings were made in February of the school year, at approxi- 
mately the same time that the standard tests in high school subjects 
were given. The teachers had already given the first semester marks, 
but did not know the results of standard tests in high school subjects. 
The use of the ratings here is somewhat fallacious for two reasons: 
(a) these are not ratings of grade teachers and could not be used in 
promotion, and (b) these traits were no doubt influenced by the pupil’s 
achievement in high school. 

The individual ratings and the total ratings on each item were 
correlated with (a) scores of the Indiana Composite Achievement Test, 
(b) average mark, and (c) achievement in high school subjects as 
measured by standard tests. The zero order coefficients of correlation 
are given in Table XLVIII. 


² The composite score was obtained for each pupil by dividing his score in each 
test by the standard deviation, then adding these resultant scores together. 
³ Rating Card is shown in the appendix. 








74 BULLETIN OF THE SCHOOL OF EDUCATION 


TABLE XLVIII.—COEFFICIENTS OF CORRELATION BETWEEN RATINGS OF PUPIL 

                             | Indiana Composite | High School   | High School
                             | Achievement Test  | Average Marks | Achievement Tests
Traits                       | r     | P.E.      | r     | P.E.  | r     | P.E.
Ability                      | .631  | .022      | .694  | .020  | .610  | .025
Industry                     | .139  | .088      | .594  | .025  | .491  | .029
[illegible]                  | .352  | .031      | .548  | .026  | .398  | .032
Interest                     | .427  | .028      | .604  | .023  | .418  | .031
Extra Curricular Activities  | .413  | .030      | .431  | .028  | .400  | .031
Total Rating                 | .499  | .029      | .663  | .021  | .351  | .031


The correlation between the rating of the trait “ability” and the 
other measures was higher than that of the Indiana Composite Achieve- 
ment Test score and success in high school work as measured both by 
average mark and standard tests. 

A noteworthy correlation is that between the Indiana Composite 
Achievement Test and the trait “industry.” The correlation is low, .139, 
and, if the trait was properly measured by this rating, it bears very 
little relation to scores made in the Indiana Composite Achievement 
Test. This trait has a correlation of .594 with average high school 
mark and .491 with achievement in high school as measured by standard 
tests. 

By combining the scores of the Indiana Composite Achievement Test 
with rating on industry by the method of multiple correlation: 

$$R_{c.12} = \sqrt{\frac{r_{c1}^{2} + r_{c2}^{2} - 2\,r_{c1}\,r_{c2}\,r_{12}}{1 - r_{12}^{2}}}$$

$$R_{c.12} = \sqrt{\frac{(.652)^{2} + (.594)^{2} - 2 \times .652 \times .594 \times .139}{1 - (.139)^{2}}}$$

$$R_{c.12} = .827 \pm .014$$


This is the maximum correlation with average high school mark that can 
be obtained by combining these two measures. This correlation is sound 
only on the assumption that the grade teacher would rate the pupil on 
industry as the high school principal and teachers have rated him. 
While this assumption is not necessarily sound, nevertheless the indi- 
cation is that rating of the trait of industry by the grade school teacher 
and the combination of that with the score on the Indiana Composite 
Achievement Test would raise the predictive value of the measure. 
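
The multiple correlation of .827 can be reproduced from the three zero-order coefficients quoted in the text. The sketch below assumes the standard two-predictor formula and takes c as the average high school mark, 1 as the Indiana Composite Achievement Test score, and 2 as the rating on industry; the function name `multiple_correlation` is illustrative only.

```python
import math


def multiple_correlation(r_c1, r_c2, r_12):
    """Multiple correlation of a criterion c with two predictors, 1 and 2.

    r_c1, r_c2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    numerator = r_c1 ** 2 + r_c2 ** 2 - 2 * r_c1 * r_c2 * r_12
    return math.sqrt(numerator / (1 - r_12 ** 2))


# .652: test score vs. average mark; .594: industry vs. average mark;
# .139: test score vs. industry (all as quoted in the text).
R = multiple_correlation(0.652, 0.594, 0.139)
print(round(R, 3))
```

Because the two predictors are nearly uncorrelated with each other (.139), each adds information the other lacks, which is why the combined coefficient is well above either zero-order value.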

If this rating were used, how might it be combined with the Indiana 
Composite Achievement Test score? The relative weighting on the 
basis of the data used here would be obtained by the formula, 
$S_T = S_1 + W S_2$. To find $W$ use the formula, 

$$W = \frac{r_{c2} - r_{c1}\,r_{12}}{r_{c1} - r_{c2}\,r_{12}} \times \frac{\sigma_1}{\sigma_2}$$

$$W = \frac{.594 - .652 \times .139}{.652 - .594 \times .139} \times \frac{103}{1.3}$$

$$S_T = S_1 + 70\,S_2$$
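
The weight can be worked through numerically from the figures substituted in the text. In the sketch below the coefficients .652, .594, and .139 and the values 103 and 1.3 are those of the substitution; reading 103 and 1.3 as the standard deviations of the test score and of the industry rating is an assumption, as are the function name and the sample pupil.

```python
def relative_weight(r_c1, r_c2, r_12, sd_1, sd_2):
    """Weight W for measure 2 when measure 1 has weight one.

    r_c1, r_c2: correlations of measures 1 and 2 with the criterion.
    r_12: correlation between the two measures.
    sd_1, sd_2: standard deviations of the two measures (assumed reading).
    """
    return (r_c2 - r_c1 * r_12) / (r_c1 - r_c2 * r_12) * (sd_1 / sd_2)


W = relative_weight(0.652, 0.594, 0.139, 103, 1.3)
print(round(W, 2))

# Hypothetical pupil: test score 620, industry rating 4.
combined = 620 + round(W) * 4
print(combined)
```

Under these assumptions the weight comes to about 70, so one point of rating counts for as much as roughly seventy points of test score.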


The partial and multiple correlations using these various traits with 
the Indiana Composite Achievement Test were also worked out. (These 
are all given in the appendix.) All partial correlations of the Indiana 
Composite Achievement Test with high school success, holding constant 
in turn the various traits, were lower than the original correlation with 
the exception of the cases wherein the trait “industry” was held con- 
stant. All multiple correlations between high school success and the 
Indiana Composite Achievement Test score were increased when the 
various traits or all traits were combined with the Indiana Composite 
Achievement Test score. The most significant correlations are given in 
Table XLIX. 


TABLE XLIX.—Partial Correlations Between Indiana Composite
Achievement Test and Success in High School, with Various Measures
Held Constant

Measures Correlated                    | Traits Held Constant | r    | P. E.
I. C. A. T. vs. H. S. marks            | Industry             | .666 | .020
I. C. A. T. vs. H. S. marks            | All traits           | .463 | .030
I. C. A. T. vs. H. S. standard tests   | Industry             | .604 | .024
I. C. A. T. vs. H. S. standard tests   | All traits           | .441 | .031
I. C. A. T. vs. H. S. marks            | Age                  | .571 | .025
I. C. A. T. vs. H. S. standard tests   | Age                  | .502 | .029







The significant facts of Table XLIX are (a) that holding constant 
the one trait of “industry” increases the correlation between the Indi- 
ana Composite Achievement Test scores and high school success, (b) 
that holding constant all traits tends to lower that correlation, and (c) 
that age held constant tends to reduce the correlation only slightly. 

TABLE L.—Multiple Correlations Between Indiana Composite Achieve-
ment Test When Combined with Other Measures and Success in
High School

Measures Correlated                    | Traits Combined with Indiana Composite Achievement Test | R    | P. E.
I. C. A. T. vs. H. S. marks            | Industry                                                | .827 | .011
I. C. A. T. vs. H. S. marks            | All traits                                              | .753 | .017
I. C. A. T. vs. H. S. standard tests   | Industry                                                | .690 | .019
I. C. A. T. vs. H. S. standard tests   | All traits                                              | .570 | .026
I. C. A. T. vs. H. S. marks            | Age                                                     | .600 | .025
I. C. A. T. vs. H. S. standard tests   | Age                                                     | .568 | .026

The significance of Table L is largely in (a) that combining the 
trait “industry” with the Indiana Composite Achievement Test increases 
the correlation markedly and also more than all traits combined with 
the Indiana Composite Achievement Test scores, and (b) that age com- 
bined with the Indiana Composite Achievement Test by this method 
shows little influence on the correlation. 

A comparison was made of the Indiana Composite Achievement Test 
and other measures as an instrument of prediction of success in high 
school. The other measures used were (a) scores on the Stanford
Achievement Test, Form A, (b) scores on the Otis Self-Administering
Intelligence Test, and (c) teachers’ marks made out by the teachers after
they had given and scored the Indiana Composite Achievement Test.⁴
Success in high school work was measured by average of all marks in 
four subjects for each semester. These data were available for 54 
pupils who took the Indiana Composite Achievement Test in April, 1927, 
and who attended high school the following year, 1927-28. The correla- 
tions of the measures compared are given in Table LI. 


TABLE LI.—Correlation of Various Measures with Average of High
School Marks in Four Subjects for First and Second Semesters
(54 cases)

Measure                            | First Semester | Second Semester
I. C. A. T.                        | .604 ± .060    | .615 ± .059
Stanford Achievement Test          | .434 ± .076    | .390 ± .081
Otis Intelligence Test             | .385 ± .081    | .450 ± .075
Teachers’ marks⁴ and I. C. A. T.   | .742 ± .048    | .682 ± .049


It seems apparent from this table that teachers’ marks in which
they used the results of the Indiana Composite Achievement Test with
the mark from the pupil’s daily work show the best predictive value.

⁴ The teachers’ marks were made out after they had given and scored the Indiana
Composite Achievement Test, Form A. They were told to use the results of the test
in whatever way they desired in making out their marks.

A comparison was made of the predictive value of (a) teachers’ final 
marks of eighth grade pupils, (b) the Indiana Composite Achievement 
Test scores, and (c) teachers’ marks, using the Indiana Composite 
Achievement Test scores. The final marks of 70 pupils in one of the 
four counties in the state that did not use any final examination were 
obtained. While these pupils are not the same as the other groups 
studied, they were in an adjoining county and the school systems were 
very much the same. Table LII gives these comparisons. 


TABLE LII.—Correlations Between Various Measures of Prognosis and
Success in High School Work as Measured by Average of First
Semester Marks in High School

Measures                                              | r    | P. E. | Number of Cases
I. C. A. T. vs. average H. S. marks                   | .594 | .060  | 54
Teachers’ marks and I. C. A. T. vs. average H. S. marks | .742 | .045  | 54
Teachers’ marks vs. average H. S. marks               | .471 | .060  | 70


If these data were all comparable it would seem apparent that (a) 
teachers’ final eighth grade marks alone were not so valuable in prog- 
nosis as Indiana Composite Achievement Test scores, and (b) teachers’ 
marks, when these marks are made out by making use of the Indiana 
Composite Achievement Test scores, are of most value in prognosis. 

Before making any conclusions on the use of the Indiana Com- 
posite Achievement Test as a prognostic measure, it is necessary to 
compare the success of this measure in that function with other meas- 
ures developed by such studies of prognosis as have been made that may 
be comparable. A summary of comparable studies is given in
Table LIII. 

TABLE LIII.—Summary of Prognostic Studies Compared to Results of
the Indiana Composite Achievement Test Used in That Function

Author              | Measures                                  | Criterion                     | Raw Correlation | Multiple Correlation | Traits Used in Multiple Correlation
T. L. Kelley (21)   | 7th grade marks⁵                          | First year H. S. average mark | .719            |                      |
                    | 6th grade marks                           | First year H. S. average mark | .728            | .789                 | All marks
                    | 5th grade marks                           | First year H. S. average mark | .531            |                      |
                    | 4th grade marks                           | First year H. S. average mark | .624            |                      |
                    | Teachers’ estimate of traits⁶             | First year H. S. average mark | .58-.72         | .76                  | Ability, interest, conscientiousness, oral English
                    | Special tests                             | First year H. S. average mark |                 | .51                  | Mathematics, English, History
                    | Average of all grades, 2-8, school marks  | First year H. S. average mark | .60             |                      |
                    | Arithmetic, grades 7-8, school marks      | First year H. S. average mark | .53             | .56-.69              | Arithmetic, 7-8; English, 4-6; age at end of grade 8
                    | English, grades 4-6, school marks         | First year H. S. average mark | .56             |                      | Days present, 4-6; effort, 7-8
                    | Effort, grades 7-8, school marks          | First year H. S. average mark | .46             |                      |
C. W. Flemming (10) | Terman Group Mental Test                  | Junior H. S. marks            | .59             | .79                  | Mental test and school attitude
                    | Stanford Achievement Test⁷                | Junior H. S. marks            | .66             |                      |
                    | Teachers’ estimate of industry⁸           | Junior H. S. marks            | .69             |                      |
                    | Teachers’ estimate of school attitude     | Junior H. S. marks            | .74             |                      |
                    | Teachers’ estimate of intelligence        | Junior H. S. marks            | .80             |                      |

⁵ Seven-four school organization. In the eight-four organization these grades would
be 5, 6, 7, and 8.
⁶ These estimates were made during the first half of the first year of high school.
⁷ This test was given at the end of the year’s work at the same time that the
marks were obtained.
⁸ These were obtained for the same school year for which marks used in criteria
were obtained and by the same teachers giving the marks.

TABLE LIII.—Continued

Author              | Measures                                  | Criterion                     | Raw Correlation | Multiple Correlation | Traits Used in Multiple Correlation
                    | Teachers’ estimate of eleven traits       | Junior H. S. marks            | .59             |                      |
W. W. Wright        | Indiana Composite Achievement Test        | Ninth year average mark       | .65             | .82                  | Indiana Composite Achievement Test and rating on industry
                    |                                           |                               |                 | .74                  | Indiana Composite Achievement Test used with teachers’ marks
                    |                                           |                               |                 | .72                  | Indiana Composite Achievement Test and ratings on ability
                    | Indiana Composite Achievement Test        | Ninth year English (test)     | .67             |                      |
                    | Indiana Composite Achievement Test        | Ninth year biology (test)     | .66             |                      |
                    | Indiana Composite Achievement Test        | Ninth year average (test)     | .55             | .69                  | Indiana Composite Achievement Test and rating on industry








In addition to the data given in Table LIII, there is much infor- 
mation on prognostic measures. In these various studies that might 
in any way be comparable, the zero order coefficients of correlation are
generally not above .60 and the coefficient of multiple correlation not 
above .80. From these data it would seem fair to assume that the 
Indiana Composite Achievement Test, Form A, compares favorably with 
other prognostic measures of general ability to do ninth grade work. 
The Indiana Composite Achievement Test, when used with teachers’ 
marks or ratings of pupils on the trait of “industry,” shows value as a 
prognostic measure.

To further study the prognostic value of any measure with a given 
coefficient of correlation, it is necessary to give a statement of proba- 
bility of the placement of pupils with that measure. 

Such tables have been computed by Professor E. L. Thorndike and 
are here given, with his permission, for coefficients of .60, .80, and .90. 

One of these tables shows that, when two measures correlate .80
and the predictive measure places pupils in the upper 10 per cent or
first tenth, we may expect 56.2 per cent of these pupils to achieve in the
criterion measure in that tenth, 23.1 per cent to fall in the next lower
tenth, 11.1 per cent in the third tenth, etc. In this manner the accuracy
of the various prognostic measures used here can be interpreted.
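
Placement figures of this kind can be approximated from the normal correlation surface. The sketch below is not Thorndike's own method of computation, which the text does not describe. It assumes that both measures are normally distributed with a linear relation (a bivariate normal surface), and it integrates numerically by Simpson's rule. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def tenth_bounds(k):
    """z-score bounds (lower, upper) of the k-th tenth; k = 1 is the top tenth."""
    upper = math.inf if k == 1 else STD.inv_cdf(1 - (k - 1) / 10)
    lower = -math.inf if k == 10 else STD.inv_cdf(1 - k / 10)
    return lower, upper

def cond_cdf(bound, x, r):
    """P(Y <= bound | X = x) on a standard bivariate normal surface."""
    if math.isinf(bound):
        return 1.0 if bound > 0 else 0.0
    return STD.cdf((bound - r * x) / math.sqrt(1 - r * r))

def placement_probability(r, predicted_tenth, criterion_tenth, steps=2000):
    """P(criterion in one tenth | predictor in another), by Simpson's rule."""
    x_lo, x_hi = tenth_bounds(predicted_tenth)
    y_lo, y_hi = tenth_bounds(criterion_tenth)
    a, b = max(x_lo, -8.0), min(x_hi, 8.0)  # cut off the negligible far tails
    h = (b - a) / steps

    def integrand(x):
        return STD.pdf(x) * (cond_cdf(y_hi, x, r) - cond_cdf(y_lo, x, r))

    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return (total * h / 3) / 0.1  # each tenth holds 10 per cent of the cases

# Pupils placed in the first tenth by a measure correlating .80 with the criterion.
for tenth in (1, 2, 3):
    print(tenth, round(100 * placement_probability(0.80, 1, tenth), 1))
```

Under the normality assumption the first two values fall close to the 56.2 and 23.1 per cent quoted from the table for a correlation of .80.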

A somewhat more practical measure may be made of these values 
of the Indiana Composite Achievement Test by a study of the cases of 
this study who entered high school and continued thruout the ninth year. 

Of the 339 cases studied there were 13 who made scores on the Indi- 
ana Composite Achievement Test above the highest quartile score given 
in the norm, and who also made an average of below 70 in their ninth 
year work. Such an average mark is taken as an indication of failure 
to achieve in high school work. These 13 pupils were rated on industry
by their high school teachers and principal as follows: 5, 4, 4, 4, 4, 5,
5, 5, 5, 4, 4, 5, 3 (on this scale 1 is the highest rating, 3 the mean,
and 5 the lowest). Six pupils were rated 5 or lowest, six were rated 4,
and one was rated 3. There seems to be rather convincing evidence that
these few cases, who scored above the highest quartile point on the
Indiana Composite Achievement Test but who failed wholly or partially
in ninth grade work, were much below the average in industry.
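
The tally of these thirteen ratings is easily verified. The short sketch below only recounts the figures listed in the text. The average it prints is an added illustration and is not a figure reported by the author.

```python
from collections import Counter
from statistics import mean

# Industry ratings of the 13 pupils as listed in the text (1 = highest, 5 = lowest).
ratings = [5, 4, 4, 4, 4, 5, 5, 5, 5, 4, 4, 5, 3]

tally = Counter(ratings)
print(sorted(tally.items()))    # [(3, 1), (4, 6), (5, 6)]
print(round(mean(ratings), 2))  # 4.38
```

The count agrees with the statement that six pupils were rated 5, six were rated 4, and one was rated 3. The average of about 4.4 lies well toward the low end of a scale whose middle rating is 3.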

There were 30 pupils whose scores were in the lowest quartile of 
the Indiana Composite Achievement Test that were in high school thru- 
out the ninth year. Of these 30, 15 failed to make a passing mark in 
one or more subjects. Of the 15 who did not make a failing mark in 
any subject, five were rated 1 (highest) in the trait “industry” by their 
high school teachers and principal. 

The lowest 10 or 15 per cent of pupils, as rated by the Indiana 
Composite Achievement Test, Form A, were probably not in high school. 
With this lower end of the distribution no measure of high school success
was attainable. This condition in a small way lowers the coefficients 
of correlation between the Indiana Composite Achievement Test and the 
measure of success in high school. No account has been taken of this 
fact in the correlations here given. 


Summary 


1. The Indiana Composite Achievement Test, Form A, used as a 
prognostic measure of general success in ninth year work, when that 
success was measured by an average of two semester marks in three 
subjects, shows only a fair degree of success. The zero order coeffi-
cient of correlation obtained was .652 ± .021.

2. The Indiana Composite Achievement Test, Form A, used as a
prognostic measure of success in ninth grade work, when that success
was measured by standard tests in high school subjects, shows only a
fair degree of success with English and biology. The zero order
coefficients of correlation were .671 ± .019 and .665 ± .023. The zero
order coefficient of correlation between total scores in the Indiana
Composite Achievement Test and algebra test scores was .532 ± .029
and between the Indiana Composite Achievement Test and Latin test
scores it was .294 ± .045. The last two correlations given are of lit-
tle value in prognosis.

3. The rating of ninth grade pupils on traits by high school teach-
ers and principals, correlated with success in high school as measured 
by marks and standard tests, shows significant correlation between the 
trait “ability” and success in ninth grade. 

4. The Indiana Composite Achievement Test, Form A, scores, when
combined with the rating of the trait “industry” by the method of
multiple correlation, correlate .827 ± .011 with success in the ninth
grade as measured by average school marks.

5. The Indiana Composite Achievement Test, Form A, shows a 
higher predictive value than (a) the Stanford Achievement Test, (b) 
the Otis Self-Administering Test of Mental Ability. 

6. Teachers’ marks, made out with the use of the Indiana Com-
posite Achievement Test, Form A, scores, show a higher predictive
value than the Indiana Composite Achievement Test used alone or marks
used alone.

7. The Indiana Composite Achievement Test, Form A, used with 
teachers’ marks or with a rating on industry, would be of value as a 
prognostic measure. 

8. The prognostic value of the Indiana Composite Achievement Test, 
Form A, compares favorably with reported data from other prognostic 
measures developed. The measure of prognostic value used is the co-
efficient of correlation, both zero order and multiple, between measures that
could be taken at the end of the eighth year and success in ninth grade 
work. 

9. The pupils who ranked in the lowest 25 per cent of the dis- 
tribution of the Indiana Composite Achievement Test, Form A, and who 
went into high school, showed a marked tendency to fail in one or more 
subjects unless they were rated high on the trait “industry.” 

Appendix


EXAMPLES OF ESSAY TYPE QUESTIONS USED IN 
AMERICAN HISTORY INDIANA STATE EXAM- 
INATIONS FOR EIGHTH GRADE 


EIGHTH GRADE 
(Answer any six.) 


1. Tell of the construction of the Panama Canal. 

2. What did the United States contribute to the winning of the World 
War? 

3. Discuss the development of American industries. 

4. Write a paragraph on capital and labor. 

5. What are some of the problems our nation has yet to solve? 

6. What importance is attached to the war with Spain? 

7. Identify: Herbert Hoover, Charles G. Dawes, Charles E. Hughes, 
William J. Bryan. 

8. What is meant by arbitration? Name one question settled by arbi- 
tration. 


UNITED STATES HISTORY 
(Any six.) 


1. Contrast the New England colonies with those of the South, taking 
Massachusetts and Virginia as types. 

2. How was the new constitution different from the old Articles of Con-
federation?

3. What effect did the Erie Canal have on life in the West?

4. What was the cause of the panic of 1837? Of 1873?

5. Which was more important, the Monroe Doctrine or the Missouri 
Compromise? Give reasons for your answer. 

6. What is the “spoils system?” Is it a good policy? Why? 

7. What is meant by “presidential electors?” How many does Indiana 
have? How are they chosen? 

8. What was the Stamp Act? Why did the Americans object to this 
act? 




QUESTIONNAIRE SENT TO STATE DEPARTMENTS 
OF EDUCATION 


July 17, 1926. 
To the State Superintendent of Public Instruction: 
Dear Sir: 
The County Superintendents’ Association of Indiana, coöperating
with the Bureau of Research of Indiana University, is endeavoring to
find a more satisfactory method of promoting grade pupils into high
school. In attacking this problem we are anxious to learn what is being 
done in other states, so I am asking each State Superintendent to give 
me a brief outline of the method used in his state to determine who 
shall enter high school. If this method or plan is in printed form, I 
shall be glad to receive the publication in which it is contained. If you 
are interested in a summary of what each state is doing, so indicate 
and I will send you the results of our investigation along with any 
plan we may adopt. 
A few of the questions I would like to have answered are: 


1. What use, if any, is made of standardized tests? (achievement and
intelligence)

2. What weight is given the recommendations of the teacher?

3. Do you have examinations based on questions sent out from the State 
Department? If so, how many? Send list of questions if available. 

4. Who grades the papers? 


Very sincerely,


STUDENT RATING CARD 


Please rate each of your ninth grade pupils, who took the standard 
tests during the month of February, on each of the characteristics listed 
below, in this manner: Think of the ninth grade pupils that you have 
taught. Think of the one that was most industrious. He would rate 
1 in industry on this scale. Think of the laziest ninth grade pupil that 
you have ever taught. He would rate 5 in industry on this scale. Set 
up similar standards for each of the points mentioned, then rate your 
present ninth graders according to these standards in each item. 

TABLE I.—Partial Coefficients of Correlation Between the Indiana
Composite Achievement Test and Semester Marks, Various Traits
as Measured by the Rating Card Being Held Constant in Turn
(N = 339)

Trait Held Constant                                                                                   | Partial r | P. E.
Teacher’s rating on ability                                                                           | .288      | .0350
Teacher’s rating on industry                                                                          | .666      | .020
Teacher’s rating on attitude                                                                          | .507      | .0292
Teacher’s rating on interest                                                                          | .472      | .0290
Teacher’s rating on ability, industry, attitude, interest, honesty, and extra-curricular activity     | .463      | .030













TABLE II.—Multiple Coefficients of Correlation Between the Indiana
Composite Achievement Test Taken Together with Teacher’s Rating
on Various Traits and Average Semester Marks (N = 339)

Trait Used with I. C. A. T.                                                                           | Multiple r | P. E.
Teacher’s rating on ability                                                                           | .723       | .0199
Teacher’s rating on industry                                                                          | .827       | .011
Teacher’s rating on attitude                                                                          | .701       | .0211
Teacher’s rating on interest                                                                          | .712       | .0206
Teacher’s rating on ability, industry, attitude, interest, honesty, and extra-curricular activity     | .753       | .0177














TABLE III.—Partial Coefficients of Correlation Between the Indiana
Composite Achievement Test and Composite Score on Standard Tests
in High School Subjects, Various Traits as Measured by the Pupil
Rating Card Being Held Constant (N = 339)

Trait Held Constant                                                                                   | Partial r | P. E.
Teacher’s rating on ability                                                                           | .264      | .0364
Teacher’s rating on industry                                                                          | .604      | .024
Teacher’s rating on attitude                                                                          | .502      | .0292
Teacher’s rating on interest                                                                          | .454      | .0311
Teacher’s rating on ability, industry, attitude, interest, honesty, and extra-curricular activity     | .441      | .0315

TABLE IV.—Multiple Coefficients of Correlation Between Indiana
Composite Achievement Test Scores Taken with Teacher’s Ratings
on Various Traits and Composite Scores on Standard Tests in High
School Subjects (N = 339)

Trait Used with I. C. A. T.                                                                           | Multiple r | P. E.
Teacher’s rating on ability                                                                           | .597       | .0249
Teacher’s rating on industry                                                                          | .690       | .019
Teacher’s rating on attitude                                                                          | .588       | .0254
Teacher’s rating on interest                                                                          | .588       | .0254
Teacher’s rating on ability, industry, attitude, interest, honesty, and extra-curricular activity     | .570       | .0264